SETTING UP R EDITOR

Exercise: 1
Date: 14.11.2019
Aim:

To know how to set up the R editor

Steps:

Step 1: Open R software

Step 2: Select File -> New script

Step 3: Select Windows -> Tile Vertically.

R Editor: [screenshot]


Interpretation:

We learned the steps to set up the R editor.

MANAGING WORKING DIRECTORY

Exercise: 2
Date: 14.11.2019
Aim:

To know how to manage the current working directory

# To get the current working directory
R Code and Output:

> getwd()
[1] "C:/Users/ugst46/Desktop"

# Changing the current working directory
R Code and Output:

> setwd("D:")
> getwd()
[1] "D:/"
> setwd("C:/Users/ugst46/Desktop")
> getwd()
[1] "C:/Users/ugst46/Desktop"
>

Interpretation:

We learned the steps to manage the current working directory.
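
The same pattern can be sketched so that it runs on any machine. This is only an illustration: it uses tempdir() as a stand-in target directory (an assumption, not a path from the exercise) and restores the original directory at the end.

```r
# Remember the starting directory so it can be restored afterwards.
old_dir <- getwd()

# tempdir() always exists, so the sketch does not depend on a drive letter.
work_dir <- tempdir()

if (dir.exists(work_dir)) {
  setwd(work_dir)             # change the working directory
  current_dir <- getwd()      # confirm the change
  print(current_dir)
}

# Restore the original working directory.
setwd(old_dir)
restored_dir <- getwd()
```

Saving the old directory first is a common habit, because setwd() changes the state of the whole R session.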

DATA TYPES IN R: VECTOR

Exercise: 3
Date: 14.11.2019
Aim:

To know how to create various types of vectors in R

R Code and Output:

> # - NUMERIC VECTOR - #
> age <- c(26,46,45,48,49,34,32,64)
> age
[1] 26 46 45 48 49 34 32 64
> # - CHARACTER VECTOR - #
> gender <- c("M","F","M","F","F","M","M","F")
> gender
[1] "M" "F" "M" "F" "F" "M" "M" "F"
> # - LOGICAL VECTOR - #
> gender1 <- c(TRUE,FALSE,TRUE,FALSE,FALSE,TRUE,TRUE,FALSE)
> gender1
[1]  TRUE FALSE  TRUE FALSE FALSE  TRUE  TRUE FALSE
> # - VECTORS USING INBUILT FUNCTION - #
> id <- seq(1,8)
> id1 <- seq(1,16,2)
> id2 <- seq(2,24,3)
> id
[1] 1 2 3 4 5 6 7 8
> id1
[1]  1  3  5  7  9 11 13 15
> id2
[1]  2  5  8 11 14 17 20 23
> values <- rep(10,8)
> values
[1] 10 10 10 10 10 10 10 10
> character <- rep(c("yes","no"),8)
> character
 [1] "yes" "no"  "yes" "no"  "yes" "no"  "yes" "no"  "yes" "no"  "yes" "no"
[13] "yes" "no"  "yes" "no"
> rand <- runif(8)
> rand
[1] 0.69208971 0.25926773 0.86212513 0.56781413 0.92371753 0.77109469 0.02762834
[8] 0.40553547
> rand1 <- runif(8,20,30)
> rand1
[1] 25.85852 28.46778 26.36216 27.81317 23.69519 28.16213 25.71561 28.88007
> rand2 <- round(runif(8,20,30),0)
> rand2
[1] 22 24 24 26 23 24 28 24
> numb <- numeric(8)
> numb
[1] 0 0 0 0 0 0 0 0

Interpretation:

We learned how to create different types of vectors in R.
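
Because runif() draws new numbers on every call, the random vectors above cannot be reproduced exactly. A minimal sketch of one way to make them repeatable follows; the seed value 123 and the object names are illustrative assumptions, not part of the exercise.

```r
# Fix the random number generator state so repeated runs give the same draws.
set.seed(123)                            # 123 is an arbitrary seed
rand_a <- runif(8, min = 20, max = 30)   # eight draws from Uniform(20, 30)

set.seed(123)
rand_b <- runif(8, min = 20, max = 30)   # same seed, so the same eight values

# seq() with named arguments, and rep() with 'each' instead of 'times'.
odd_ids <- seq(from = 1, to = 15, by = 2)
answers <- rep(c("yes", "no"), each = 2)

# Every vector has exactly one storage type.
types <- c(typeof(odd_ids), typeof(answers), typeof(rand_a > 25))
print(types)
```

Naming the arguments (from, to, by, each) makes the intent of each call easier to read than positional arguments.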


DATA FRAME

Exercise: 4
Date: 14.11.2019

Aim:

To know how to create a data frame in R and to view a specific element of a dataset.

R Code and Output:

> # - DATA FRAME - #
> patientid <- c(1,2,3,4,5,6,7,8,9,10)
> age <- c(24,46,54,53,46,70,58,64,69,42)
> gender <- c("M","M","F","M","F","F","F","M","M","F")
> diabetetype <- c("Type1","Type2","Type2","Type2","Type1","Type1","Type1","Type2","Type2","Type1")
> status <- c("Poor","Improved","Excellent","Poor","Improved","Excellent","Improved","Improved","Poor","Poor")
> feespaid <- c(200,100,100,200,100,200,100,150,250,150)
> patientdata <- data.frame(patientid,age,gender,diabetetype,status,feespaid)
> fix(patientdata)
>

Interpretation:

We learnt how to create a data frame in R and how to view a specific element of a dataset.
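
Viewing a specific element can also be done from the console rather than through the data editor. The following is a minimal sketch on a shortened copy of the data (first four patients, three columns); the object name patients is an assumption made for the illustration.

```r
# Shortened copy of the exercise data: first four patients, three columns.
patients <- data.frame(
  patientid = 1:4,
  age       = c(24, 46, 54, 53),
  status    = c("Poor", "Improved", "Excellent", "Poor"),
  stringsAsFactors = FALSE
)

third_age   <- patients$age[3]        # element 3 of the 'age' column
second_stat <- patients[2, "status"]  # row 2, column 'status'
first_row   <- patients[1, ]          # the whole first row (still a data frame)

# Rows where age exceeds 50, restricted to two columns.
older <- patients[patients$age > 50, c("patientid", "age")]

str(patients)   # compact overview of the column types
print(older)
```

The pattern is always dataframe[rows, columns]; leaving one side empty selects everything along that dimension.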

MATRIX

Exercise: 5
Date: 19.11.2019

R Code and Output:

> values <- c(1,2,3,4,5,6,7,8,9)
> A <- matrix(values,nrow=3,ncol=3,byrow=FALSE)
> B <- matrix(values,nrow=3,ncol=3,byrow=TRUE)
> A
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> B
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9

Matrix Operation:
R Program:

> m1values <- c(7,8,4,6,9,2,4,6,1)
> x <- matrix(m1values,nrow=3,ncol=3,byrow=T)
> m2values <- c(4,6,4,7,8,9,5,1,0)
> y <- matrix(m2values,nrow=3,ncol=3,byrow=T)
> m3values <- c(3,1,0,6,4,8,1,6)
> z <- matrix(m3values,nrow=4,ncol=2,byrow=T)
> x
     [,1] [,2] [,3]
[1,]    7    8    4
[2,]    6    9    2
[3,]    4    6    1
> y
     [,1] [,2] [,3]
[1,]    4    6    4
[2,]    7    8    9
[3,]    5    1    0
> z
     [,1] [,2]
[1,]    3    1
[2,]    0    6
[3,]    4    8
[4,]    1    6
> # - TRANSPOSE OF A MATRIX - #
> t(x)
     [,1] [,2] [,3]
[1,]    7    6    4
[2,]    8    9    6
[3,]    4    2    1
> # - ADDITION OF A MATRIX - #
> x+y
     [,1] [,2] [,3]
[1,]   11   14    8
[2,]   13   17   11
[3,]    9    7    1
> # - SUBTRACTION OF A MATRIX - #
> x-y
     [,1] [,2] [,3]
[1,]    3    2    0
[2,]   -1    1   -7
[3,]   -1    5    1
> # - MULTIPLICATION OF A MATRIX - #
> x%*%y
     [,1] [,2] [,3]
[1,]  104  110  100
[2,]   97  110  105
[3,]   63   73   70
> # - DETERMINANT OF A MATRIX - #
> det(x)
[1] -5
> # - INVERSE OF A MATRIX - #
> solve(x)
     [,1] [,2] [,3]
[1,]  0.6 -3.2    4
[2,] -0.4  1.8   -2
[3,]  0.0  2.0   -3
> # - EIGEN VALUES AND EIGEN VECTORS - #
> eigen(x)
eigen() decomposition
$values
[1] 16.8037610  0.6523576 -0.4561186

$vectors
           [,1]       [,2]       [,3]
[1,] -0.6715278 -0.8151865 -0.6686285
[2,] -0.6202189  0.5299108  0.2784185
[3,] -0.4054367  0.2338065  0.6895064

> # - MATRIX AND ITS FUNCTIONS - #
> mvalues <- c(7,8,2,4,9,5,6,7,4)
> matA <- matrix(mvalues,nrow=3,ncol=3,byrow=F,dimnames=list(c("X","Y","Z"),c("A","B","C")))
> matA
  A B C
X 7 4 6
Y 8 9 7
Z 2 5 4
> m1values <- c(2,3,4,5,4,3,4,6,5)
> matB <- matrix(m1values,nrow=3,ncol=3,byrow=T,dimnames=list(c("X","Y","Z"),c("A","B","C")))
> matB
  A B C
X 2 3 4
Y 5 4 3
Z 4 6 5
> matA[1,] # to get the first row elements of the matrix
A B C
7 4 6
> matA[,3] # to get the third column elements of the matrix
X Y Z
6 7 4
> matA[2,3] # to get the second row and third column element of the matrix
[1] 7
> matA[1,c(2,3)] # to get the first row and second and third column elements of the matrix
B C
4 6
> matA[-1,] # to get all elements except the first row elements
  A B C
Y 8 9 7
Z 2 5 4
>

Interpretation:

We learnt the R coding to create matrices and how to perform various operations on matrices.
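
The inverse, determinant and eigen results for x can be cross-checked against each other. The sketch below is one possible sanity check, not part of the original exercise; it rebuilds the matrix x (rows 7 8 4, 6 9 2, 4 6 1) and uses all.equal() so that small rounding errors are tolerated.

```r
# Matrix x, filled row by row.
x <- matrix(c(7, 8, 4,
              6, 9, 2,
              4, 6, 1), nrow = 3, ncol = 3, byrow = TRUE)

x_inv <- solve(x)

# A matrix times its inverse should be the identity (up to rounding error).
inverse_ok <- isTRUE(all.equal(x %*% x_inv, diag(3)))

# For every eigenpair, x %*% v should equal lambda * v.
e <- eigen(x)
eigen_ok <- all(sapply(seq_along(e$values), function(i) {
  v <- e$vectors[, i]
  isTRUE(all.equal(as.vector(x %*% v), e$values[i] * v))
}))

# The determinant equals the product of the eigenvalues.
det_ok <- isTRUE(all.equal(det(x), prod(e$values)))

print(c(inverse = inverse_ok, eigen = eigen_ok, determinant = det_ok))
```

Exact comparison with == is avoided here because floating-point results such as x %*% solve(x) are rarely exactly equal to the identity matrix.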

IMPORTING AND EXPORTING DATA FILES

Exercise: 6
Date: 22.11.2019
Aim:

To know how to import and export data files in R.

R Code for Importing Data:

> mydata <- read.table("U:/ugst46/Book1.csv", header = T, sep = ",")
> mydata

R Output for Importing Data:

> mydata
   S.No Mean.arterial.blood.pressure Age Weight Body.Surface Heart.Beat
1     1                          105  47   85.4         1.75         63
2     2                          115  49   94.2         2.10         70
3     3                          116  49   95.3         1.98         72
4     4                          117  50   94.7         2.01         73
5     5                          112  51   89.4         1.89         72
6     6                          121  48   99.5         2.25         71
7     7                          121  49   99.8         2.25         69
8     8                          110  47   90.9         1.90         66
9     9                          110  49   89.2         1.83         69
10   10                          114  48   97.7         2.07         64
>

R Code for Exporting Data:

> write.table(mtcars, "U:/ugst46/data.csv", sep = ",")
> fix(mtcars)

AJEETH.H  |  17-UST-046  |  PG  &  RESEARCH  DEPARTMENT  OF  STATISTICS,  LOYOLA  COLLEGE,  CHENNAI-34 




R Output for Exporting Data:

   row.names            mpg cyl  disp  hp drat    wt  qsec vs am gear carb
1  Mazda RX4             21   6   160 110  3.9  2.62 16.46  0  1    4    4
2  Mazda RX4 Wag         21   6   160 110  3.9 2.875 17.02  0  1    4    4
3  Datsun 710          22.8   4   108  93 3.85  2.32 18.61  1  1    4    1
4  Hornet 4 Drive      21.4   6   258 110 3.08 3.215 19.44  1  0    3    1
5  Hornet Sportabout   18.7   8   360 175 3.15  3.44 17.02  0  0    3    2
6  Valiant             18.1   6   225 105 2.76  3.46 20.22  1  0    3    1
7  Duster 360          14.3   8   360 245 3.21  3.57 15.84  0  0    3    4
8  Merc 240D           24.4   4 146.7  62 3.69  3.19    20  1  0    4    2
9  Merc 230            22.8   4 140.8  95 3.92  3.15  22.9  1  0    4    2
10 Merc 280            19.2   6 167.6 123 3.92  3.44  18.3  1  0    4    4
11 Merc 280C           17.8   6 167.6 123 3.92  3.44  18.9  1  0    4    4
12 Merc 450SE          16.4   8 275.8 180 3.07  4.07  17.4  0  0    3    3
13 Merc 450SL          17.3   8 275.8 180 3.07  3.73  17.6  0  0    3    3
14 Merc 450SLC         15.2   8 275.8 180 3.07  3.78    18  0  0    3    3
15 Cadillac Fleetwood  10.4   8   472 205 2.93  5.25 17.98  0  0    3    4
16 Lincoln Continental 10.4   8   460 215    3 5.424 17.82  0  0    3    4
17 Chrysler Imperial   14.7   8   440 230 3.23 5.345 17.42  0  0    3    4
18 Fiat 128            32.4   4  78.7  66 4.08   2.2 19.47  1  1    4    1
19 Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
20 Toyota Corolla      33.9   4  71.1  65 4.22 1.835  19.9  1  1    4    1
21 Toyota Corona       21.5   4 120.1  97  3.7 2.465 20.01  1  0    3    1
22 Dodge Challenger    15.5   8   318 150 2.76  3.52 16.87  0  0    3    2
23 AMC Javelin         15.2   8   304 150 3.15 3.435  17.3  0  0    3    2
24 Camaro Z28          13.3   8   350 245 3.73  3.84 15.41  0  0    3    4
25 Pontiac Firebird    19.2   8   400 175 3.08 3.845 17.05  0  0    3    2
26 Fiat X1-9           27.3   4    79  66 4.08 1.935  18.9  1  1    4    1
27 Porsche 914-2         26   4 120.3  91 4.43  2.14  16.7  0  1    5    2
28 Lotus Europa        30.4   4  95.1 113 3.77 1.513  16.9  1  1    5    2

Interpretation:

We learnt the R coding for importing and exporting data files.
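
A quick way to confirm that an export worked is to read the file straight back. The sketch below assumes a temporary file created with tempfile() instead of the network drive used in the exercise, and uses write.csv() and read.csv(), which are the comma-separated variants of write.table() and read.table().

```r
# Write the built-in mtcars data to a temporary CSV file and read it back.
csv_path <- tempfile(fileext = ".csv")   # stand-in location for this sketch

write.csv(mtcars, file = csv_path, row.names = TRUE)

# The first column of the file holds the car names, so use it as row names.
cars_back <- read.csv(csv_path, header = TRUE, row.names = 1)

same_shape <- identical(dim(cars_back), dim(mtcars))
same_names <- identical(rownames(cars_back), rownames(mtcars))
print(c(same_shape = same_shape, same_names = same_names))

unlink(csv_path)   # remove the temporary file
```

Without row.names = 1 the car names would come back as an ordinary first column instead of as row names.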

CREATING NEW VARIABLES USING MATHEMATICAL OPERATOR
Exercise: 7
Date: 26.11.2019
Aim:

To know how to create new variables using mathematical operators in R.

R Code and Output for Creating new variable:

> patientid <- c(1,2,3,4,5,6,7,8,9,10)
> bmi <- c(23.4,25.5,21,23.6,27.8,20.5,19.5,26.5,28.9,28.6)
> age <- c(25,48,51,29,45,60,23,52,59,23)
> height <- c(142,164,164,177,187,146,153,158,147,149)
> weight <- c(78,77,67,49,79,74,67,64,61,88)
> patientdetails <- data.frame(patientid,bmi,age,height,weight)
> fix(patientdetails)
> # - CREATING NEW VARIABLES USING MATHEMATICAL OPERATOR - #
> # - MATHEMATICAL OPERATORS ARE - #
>
> patientdetails$hwr <- patientdetails$height/patientdetails$weight
> patientdetails$hwr
[1] 1.820513 2.129870 2.447761 3.612245 2.367089 1.972973 2.283582 2.468750
[9] 2.409836 1.693182
> patientdetails$age_inmonths <- patientdetails$age*12
> patientdetails$age_inmonths
[1] 300 576 612 348 540 720 276 624 708 276
> patientdetails1 <- cbind(patientdetails,patientdetails$hwr,patientdetails$age_inmonths)
> patientdetails1
   patientid  bmi age height weight      hwr age_inmonths patientdetails$hwr
1          1 23.4  25    142     78 1.820513          300           1.820513
2          2 25.5  48    164     77 2.129870          576           2.129870
3          3 21.0  51    164     67 2.447761          612           2.447761
4          4 23.6  29    177     49 3.612245          348           3.612245
5          5 27.8  45    187     79 2.367089          540           2.367089
6          6 20.5  60    146     74 1.972973          720           1.972973
7          7 19.5  23    153     67 2.283582          276           2.283582
8          8 26.5  52    158     64 2.468750          624           2.468750
9          9 28.9  59    147     61 2.409836          708           2.409836
10        10 28.6  23    149     88 1.693182          276           1.693182
   patientdetails$age_inmonths
1                          300
2                          576
3                          612
4                          348
5                          540
6                          720
7                          276
8                          624
9                          708
10                         276

R Data Editor

   patientid  bmi age height weight
1          1 23.4  25    142     78
2          2 25.5  48    164     77
3          3   21  51    164     67
4          4 23.6  29    177     49
5          5 27.8  45    187     79
6          6 20.5  60    146     74
7          7 19.5  23    153     67
8          8 26.5  52    158     64
9          9 28.9  59    147     61
10        10 28.6  23    149     88

Interpretation:

We learnt the R coding for creating new variables using mathematical operators.
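
Derived columns can also be added in a single step with transform(), which avoids the duplicated columns that cbind() produces. This is a sketch on the first three patients only; the object name patients is an assumption for the illustration.

```r
# Illustrative data: the first three patients from the exercise.
patients <- data.frame(
  patientid = 1:3,
  age       = c(25, 48, 51),
  height    = c(142, 164, 164),
  weight    = c(78, 77, 67)
)

# transform() evaluates the expressions inside the data frame, so the
# columns can be named directly instead of via patients$...
patients <- transform(
  patients,
  hwr          = height / weight,   # height-to-weight ratio
  age_inmonths = age * 12           # age converted to months
)

print(patients)
```

Assigning with patients$new <- ... and using transform() give the same columns; transform() is simply more compact when several variables are created at once.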

CREATING NEW VARIABLES USING CONDITIONAL STATEMENT
Exercise: 8
Date: 26.11.2019
Aim:

To know how to create new variables using conditional statements If Else and If Else If in R.

R Code and Output:

> # - creating new variable using conditional statement - #
> # - creating new variable using conditional IF ELSE statement - #
> patientdetails$status <- ifelse(patientdetails$age >= 53, "Senior Citizen", "Youngsters")
> patientdetails <- cbind(patientdetails, patientdetails$status)
> patientdetails
   patientid  bmi age height weight      hwr age_inmonths         status
1          1 23.4  25    142     78 1.820513          300     Youngsters
2          2 25.5  48    164     77 2.129870          576     Youngsters
3          3 21.0  51    164     67 2.447761          612     Youngsters
4          4 23.6  29    177     49 3.612245          348     Youngsters
5          5 27.8  45    187     79 2.367089          540     Youngsters
6          6 20.5  60    146     74 1.972973          720 Senior Citizen
7          7 19.5  23    153     67 2.283582          276     Youngsters
8          8 26.5  52    158     64 2.468750          624     Youngsters
9          9 28.9  59    147     61 2.409836          708 Senior Citizen
10        10 28.6  23    149     88 1.693182          276     Youngsters

   patientid  bmi age height weight      hwr age_inmonths status
1          1 23.4  25    142     78 1.820513          300 Youngsters
2          2 25.5  48    164     77  2.12987          576 Youngsters
3          3   21  51    164     67 2.447761          612 Youngsters
4          4 23.6  29    177     49 3.612245          348 Youngsters
5          5 27.8  45    187     79 2.367089          540 Youngsters
6          6 20.5  60    146     74 1.972973          720 Senior Citizen
7          7 19.5  23    153     67 2.283582          276 Youngsters
8          8 26.5  52    158     64  2.46875          624 Youngsters
9          9 28.9  59    147     61 2.409836          708 Senior Citizen
10        10 28.6  23    149     88 1.693182          276 Youngsters
   patientdetails$status patientdetails$status bmi_Groups
1  Youngsters            Youngsters            normal
2  Youngsters            Youngsters            overweight
3  Youngsters            Youngsters            normal
4  Youngsters            Youngsters            normal
5  Youngsters            Youngsters            overweight
6  Senior Citizen        Senior Citizen        normal
7  Youngsters            Youngsters            normal
8  Youngsters            Youngsters            overweight
9  Senior Citizen        Senior Citizen        overweight
10 Youngsters            Youngsters            overweight

Interpretation:

We learnt how to create new variables using the conditional statements If Else and If Else If in R.
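
The code that produced the bmi_Groups column does not appear in the record. One common way to express an If Else If rule in R is to nest ifelse() calls, sketched below. The cut-offs 18.5 and 25 are assumptions chosen for the illustration; they happen to reproduce the normal/overweight labels shown in the output, but they are not taken from the exercise.

```r
bmi <- c(23.4, 25.5, 21, 23.6, 27.8, 20.5, 19.5, 26.5, 28.9, 28.6)

# Assumed cut-offs (not given in the exercise): below 18.5 is "Underweight",
# 18.5 up to but not including 25 is "normal", 25 and above is "overweight".
bmi_Groups <- ifelse(bmi < 18.5, "Underweight",
              ifelse(bmi < 25,   "normal",
                                 "overweight"))

print(data.frame(bmi, bmi_Groups))
print(table(bmi_Groups))
```

Each nested ifelse() handles one further branch, so the conditions are checked in order from the outermost call inwards.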

CREATING INDICATOR VARIABLES USING CONDITIONAL STATEMENT
Exercise: 9
Date: 26.11.2019

Aim:

To know how to create indicator variables using conditional statements

R Code and Output:

> # - creating Indicator variables - #
> patientdetails$normal <- ifelse(patientdetails$bmi_Groups == c("Normal"),1,0)
> patientdetails$overweight <- ifelse(patientdetails$bmi_Groups == c("overweight"),1,0)
> patientdetails$Underweight <- ifelse(patientdetails$bmi_Groups == c("Underweight"),1,0)
> fix(patientdetails)

   patientid  bmi age height weight      hwr age_inmonths status
1          1 23.4  25    142     78 1.820513          300 Youngsters
2          2 25.5  48    164     77  2.12987          576 Youngsters
3          3   21  51    164     67 2.447761          612 Youngsters
4          4 23.6  29    177     49 3.612245          348 Youngsters
5          5 27.8  45    187     79 2.367089          540 Youngsters
6          6 20.5  60    146     74 1.972973          720 Senior Citizen
7          7 19.5  23    153     67 2.283582          276 Youngsters
8          8 26.5  52    158     64  2.46875          624 Youngsters
9          9 28.9  59    147     61 2.409836          708 Senior Citizen
10        10 28.6  23    149     88 1.693182          276 Youngsters
   patientdetails$status patientdetails$status bmi_Groups normal overweight Underweight
1  Youngsters            Youngsters            normal          0          0           0
2  Youngsters            Youngsters            overweight      0          1           0
3  Youngsters            Youngsters            normal          0          0           0
4  Youngsters            Youngsters            normal          0          0           0
5  Youngsters            Youngsters            overweight      0          1           0
6  Senior Citizen        Senior Citizen        normal          0          0           0
7  Youngsters            Youngsters            normal          0          0           0
8  Youngsters            Youngsters            overweight      0          1           0
9  Senior Citizen        Senior Citizen        overweight      0          1           0
10 Youngsters            Youngsters            overweight      0          1           0

> marks <- read.table(file = "clipboard", sep = "\t", header = TRUE)
> marks
   Dept.No X1st.sem X2nd.sem Average
1        1       45       67    56.0
2        2       78       74    76.0
3        3       54       62    58.0
4        4       58       64    61.0
5        5       59       60    59.5
6        6       68       68    68.0
7        7       67       69    68.0
8        8       62       64    63.0
9        9       88       90    89.0
10      10       90       92    91.0

Data Editor

   Dept.No X1st.sem X2nd.sem Average avgclass
1        1       45       67      56 Second class
2        2       78       74      76 First class
3        3       54       62      58 Second class
4        4       58       64      61 First class
5        5       59       60    59.5 Second class
6        6       68       68      68 First class
7        7       67       69      68 First class
8        8       62       64      63 First class
9        9       88       90      89 Distinction
10      10       90       92      91 Distinction

> # - CREATING INDICATOR VARIABLES - #
> mark$distinction <- ifelse(mark$avgclass == c("Distinction"),1,0)
> mark$firstclass <- ifelse(mark$avgclass == c("First clss"),1,0)
> mark$secondclass <- ifelse(mark$avgclass == c("Second class"),1,0)
> mark
   Dept.No X1st.sem X2nd.sem Average     avgclass distinction firstclass
1        1       45       67    56.0 Second class           0          0
2        2       78       74    76.0  First class           0          0
3        3       54       62    58.0 Second class           0          0
4        4       58       64    61.0  First class           0          0
5        5       59       60    59.5 Second class           0          0
6        6       68       68    68.0  First class           0          0
7        7       67       69    68.0  First class           0          0
8        8       62       64    63.0  First class           0          0
9        9       88       90    89.0  Distinction           1          0
10      10       90       92    91.0  Distinction           1          0
   secondclass
1            1
2            0
3            1
4            0
5            1
6            0
7            0
8            0
9            0
10           0

Interpretation:

We learnt the R coding for creating indicator variables using conditional statements.
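
String comparison with == is exact, so a label that differs in case or spelling from the stored value never matches, and the resulting indicator is 0 for every row. The sketch below illustrates this on a short made-up vector of group labels and shows one possible way to make the match case-insensitive with tolower().

```r
# Group labels written in lower case, as in the exercise output.
bmi_Groups <- c("normal", "overweight", "normal", "overweight")

# '==' compares strings exactly, so "Normal" does not match "normal".
exact_match <- ifelse(bmi_Groups == "Normal", 1, 0)

# Converting the data to lower case makes the comparison case-insensitive.
normal     <- ifelse(tolower(bmi_Groups) == "normal", 1, 0)
overweight <- ifelse(tolower(bmi_Groups) == "overweight", 1, 0)

print(data.frame(bmi_Groups, exact_match, normal, overweight))
```

Checking table() of the source column before building indicators is a simple way to see the exact spellings that need to be matched.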

SORTING DATASET IN ASCENDING AND DESCENDING ORDER
Exercise: 10
Date: 30.11.2019

Aim:

To know how to sort a dataset in ascending and descending order in R

R Code and Output for sorting in Ascending order:

> # - SORTING IN ASCENDING ORDER - #
> mark1 <- mark[order(mark$X1st.sem),]
> fix(mark1)

Data Editor

   row.names Dept.No X1st.sem X2nd.sem Average avgclass     distinction
1          1       1       45       67      56 Second class           0
2          3       3       54       62      58 Second class           0
3          4       4       58       64      61 First class            0
4          5       5       59       60    59.5 Second class           0
5          8       8       62       64      63 First class            0
6          7       7       67       69      68 First class            0
7          6       6       68       68      68 First class            0
8          2       2       78       74      76 First class            0
9          9       9       88       90      89 Distinction            1
10        10      10       90       92      91 Distinction            1

R Code and Output for sorting in Descending order:

> # - SORTING IN DESCENDING ORDER - #
> mark2 <- mark[order(-mark$X1st.sem),]
> fix(mark2)

   row.names Dept.No X1st.sem X2nd.sem Average avgclass     distinction firstclass secondclass
1         10      10       90       92      91 Distinction            1          0           0
2          9       9       88       90      89 Distinction            1          0           0
3          2       2       78       74      76 First class            0          0           0
4          6       6       68       68      68 First class            0          0           0
5          7       7       67       69      68 First class            0          0           0
6          8       8       62       64      63 First class            0          0           0
7          5       5       59       60    59.5 Second class           0          0           1
8          4       4       58       64      61 First class            0          0           0
9          3       3       54       62      58 Second class           0          0           1
10         1       1       45       67      56 Second class           0          0           1

R Code and Output for sorting dataset with 2 variables:

> # - SORTING DATASET WITH RESPECT TO 2 VARIABLES - #
> mark3 <- mark[order(mark$X1st.sem,mark$X2nd.sem),]
> fix(mark3)

   row.names Dept.No X1st.sem X2nd.sem Average avgclass     distinction firstclass secondclass
1          1       1       45       67      56 Second class           0          0           1
2          3       3       54       62      58 Second class           0          0           1
3          4       4       58       64      61 First class            0          0           0
4          5       5       59       60    59.5 Second class           0          0           1
5          8       8       62       64      63 First class            0          0           0
6          7       7       67       69      68 First class            0          0           0
7          6       6       68       68      68 First class            0          0           0
8          2       2       78       74      76 First class            0          0           0
9          9       9       88       90      89 Distinction            1          0           0
10        10      10       90       92      91 Distinction            1          0           0

Interpretation:

We learnt the R coding to sort a dataset in ascending and descending order.
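
The second sort key only matters when the first key has ties, which the marks data does not contain. The sketch below uses a small made-up data frame with a tie in sem1 to show the effect; the column names id, sem1 and sem2 are assumptions for the illustration.

```r
# Small illustrative data frame with a tie in 'sem1' (values are made up).
marks <- data.frame(
  id   = 1:5,
  sem1 = c(62, 45, 62, 90, 58),
  sem2 = c(64, 67, 60, 92, 64)
)

ascending  <- marks[order(marks$sem1), ]                     # smallest first
descending <- marks[order(marks$sem1, decreasing = TRUE), ]  # largest first

# Sort by sem1 ascending and break ties by sem2 descending.
mixed <- marks[order(marks$sem1, -marks$sem2), ]

print(ascending$sem1)
print(descending$sem1)
print(mixed$id)
```

Negating a numeric key, as in -marks$sem2, reverses the direction for that key only; decreasing = TRUE reverses every key at once.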

DROP AND KEEP VARIABLES

Exercise: 11
Date: 30.11.2019
Aim:

To know how to drop and keep variables in R

R Code and Output:

> # - DROP AND KEEP VARIABLES IN DATASET - #
> # - DROP VARIABLES IN DATASET - #
> mtcars
                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

Dropping variables:

> data <- names(mtcars) %in% c("mpg","cyl")
> new <- mtcars[!data]
> new
                     disp  hp drat    wt  qsec vs am gear carb
Mazda RX4           160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag       160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710          108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive      258.0 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout   360.0 175 3.15 3.440 17.02  0  0    3    2
Valiant             225.0 105 2.76 3.460 20.22  1  0    3    1
Duster 360          360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 240D           146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230            140.8  95 3.92 3.150 22.90  1  0    4    2
Merc 280            167.6 123 3.92 3.440 18.30  1  0    4    4
Merc 280C           167.6 123 3.92 3.440 18.90  1  0    4    4
Merc 450SE          275.8 180 3.07 4.070 17.40  0  0    3    3
Merc 450SL          275.8 180 3.07 3.730 17.60  0  0    3    3
Merc 450SLC         275.8 180 3.07 3.780 18.00  0  0    3    3
Cadillac Fleetwood  472.0 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 460.0 215 3.00 5.424 17.82  0  0    3    4
Chrysler Imperial   440.0 230 3.23 5.345 17.42  0  0    3    4
Fiat 128             78.7  66 4.08 2.200 19.47  1  1    4    1
Honda Civic          75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla       71.1  65 4.22 1.835 19.90  1  1    4    1
Toyota Corona       120.1  97 3.70 2.465 20.01  1  0    3    1
Dodge Challenger    318.0 150 2.76 3.520 16.87  0  0    3    2
AMC Javelin         304.0 150 3.15 3.435 17.30  0  0    3    2
Camaro Z28          350.0 245 3.73 3.840 15.41  0  0    3    4
Pontiac Firebird    400.0 175 3.08 3.845 17.05  0  0    3    2
Fiat X1-9            79.0  66 4.08 1.935 18.90  1  1    4    1
Porsche 914-2       120.3  91 4.43 2.140 16.70  0  1    5    2
Lotus Europa         95.1 113 3.77 1.513 16.90  1  1    5    2
Ford Pantera L      351.0 264 4.22 3.170 14.50  0  1    5    4
Ferrari Dino        145.0 175 3.62 2.770 15.50  0  1    5    6
Maserati Bora       301.0 335 3.54 3.570 14.60  0  1    5    8
Volvo 142E          121.0 109 4.11 2.780 18.60  1  1    4    2

Keeping variables:

> new1 <- mtcars[data]
> new1
                     mpg cyl
Mazda RX4           21.0   6
Mazda RX4 Wag       21.0   6
Datsun 710          22.8   4
Hornet 4 Drive      21.4   6
Hornet Sportabout   18.7   8
Valiant             18.1   6
Duster 360          14.3   8
Merc 240D           24.4   4
Merc 230            22.8   4
Merc 280            19.2   6
Merc 280C           17.8   6
Merc 450SE          16.4   8
Merc 450SL          17.3   8
Merc 450SLC         15.2   8
Cadillac Fleetwood  10.4   8
Lincoln Continental 10.4   8
Chrysler Imperial   14.7   8
Fiat 128            32.4   4
Honda Civic         30.4   4
Toyota Corolla      33.9   4
Toyota Corona       21.5   4
Dodge Challenger    15.5   8
AMC Javelin         15.2   8
Camaro Z28          13.3   8
Pontiac Firebird    19.2   8
Fiat X1-9           27.3   4
Porsche 914-2       26.0   4
Lotus Europa        30.4   4
Ford Pantera L      15.8   8
Ferrari Dino        19.7   6
Maserati Bora       15.0   8
Volvo 142E          21.4   4

Interpretation:

We learnt the R coding for dropping and keeping variables in R.
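
Columns can also be dropped or kept directly by name, without building a logical vector first. The sketch below shows two alternative spellings of the same operations on mtcars, using setdiff() and the select argument of subset(); the object names are assumptions for the illustration.

```r
# Drop columns by name: keep every column whose name is not in 'drop_cols'.
drop_cols <- c("mpg", "cyl")
dropped   <- mtcars[, setdiff(names(mtcars), drop_cols)]

# Keep columns by name: index with the character vector directly.
kept <- mtcars[, c("mpg", "cyl")]

# subset() offers the same two operations through its 'select' argument.
dropped2 <- subset(mtcars, select = -c(mpg, cyl))
kept2    <- subset(mtcars, select = c(mpg, cyl))

print(names(dropped))
print(names(kept))
```

All three approaches, including the %in% version used in the exercise, leave the original mtcars data unchanged and return a new data frame.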

SUBSETTING DATA

Exercise: 12
Date: 3.12.2019
Aim:

To know how to subset data in R.

R Code and Output:

> newdata <- mtcars[which(mtcars$cyl==8 & mtcars$carb>=4),]
> newdata
                     mpg cyl disp  hp drat    wt  qsec vs am gear carb
Duster 360          14.3   8  360 245 3.21 3.570 15.84  0  0    3    4
Cadillac Fleetwood  10.4   8  472 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 10.4   8  460 215 3.00 5.424 17.82  0  0    3    4
Chrysler Imperial   14.7   8  440 230 3.23 5.345 17.42  0  0    3    4
Camaro Z28          13.3   8  350 245 3.73 3.840 15.41  0  0    3    4
Ford Pantera L      15.8   8  351 264 4.22 3.170 14.50  0  1    5    4
Maserati Bora       15.0   8  301 335 3.54 3.570 14.60  0  1    5    8
> # - subsetting data - #
> # - Selecting first three rows - #
> newdata <- mtcars[1:3,]
> newdata
               mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4     21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710    22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
> newdata <- mtcars[which(mtcars$cyl==8),]
> newdata
                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
>
Interpretation:

We learnt the R code for subsetting data in R.
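
The same row selections can also be written with subset(), which avoids repeating the data frame name inside the brackets. The following is a minimal sketch on the built-in mtcars data; the object names big_v8, big_v8_idx and first_three are illustrative only and are not part of the original exercise.

```r
# Rows with 8 cylinders and at least 4 carburettors, using subset()
big_v8 <- subset(mtcars, cyl == 8 & carb >= 4)

# The same selection with plain logical indexing, for comparison
big_v8_idx <- mtcars[mtcars$cyl == 8 & mtcars$carb >= 4, ]

# First three rows, equivalent to mtcars[1:3, ]
first_three <- head(mtcars, 3)

print(rownames(big_v8))
print(rownames(first_three))
```

Logical indexing without which() behaves the same here because neither column contains missing values; with NA values present, which() drops the NA rows while plain logical indexing returns rows filled with NA.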


SELECTING  RANDOM  SAMPLES  FROM  THE  DATASET 


Exercise:  13 
Date:  3.12.2019 
Aim: 


To know how to select a random sample of size n in R.


R  Code  and  Output: 


> # - Select random sample of size n - #

> id<-read.table("U:\\ugst46\\Book1.csv", header=T, sep=",")

> sample1 <- id[sample(1:nrow(id), 4, replace=FALSE), ]

> sample1

  S.No Mean.arterial.blood.pressure Age Weight Body.Surface Heart.Beat
5    5                          112  51   89.4         1.89         72
8    8                          110  47   90.9         1.90         66
1    1                          105  47   85.4         1.75         63
9    9                          110  48   89.2         1.83         69


Interpretation: 

We learnt the R code for selecting random samples from a dataset.
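
Because sample() draws different rows on every run, a seed is needed if the same sample has to be reproduced later. The sketch below assumes the built-in mtcars data in place of the CSV file used above; the seed value 46 and the object names are arbitrary choices made only for illustration.

```r
set.seed(46)  # arbitrary seed, fixed only so the draw can be repeated

# Draw 4 distinct row positions, then pick those rows
rows <- sample(seq_len(nrow(mtcars)), size = 4, replace = FALSE)
sample1 <- mtcars[rows, ]

print(sample1)
```

Using seq_len(nrow(x)) instead of 1:nrow(x) is a small safeguard: it yields an empty index for a data frame with zero rows, whereas 1:0 would count down from 1 to 0.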
AGGREGATE DATASET


Exercise:  14 
Date:  3.12.2019 


Aim: 


To  know  how  to  aggregate  dataset  in  R 


R  Code  and  Output: 


> # - AGGREGATE DATASET - #

> value<- data.frame(custid=c(1,1,2,3,3,3,4,5,5,6,7,7,8,9,9,10),
+ purvalue=c(125,154,136,245,156,398,456,568,459,235,654,547,126,154,187,145))

> value

   custid purvalue
1       1      125
2       1      154
3       2      136
4       3      245
5       3      156
6       3      398
7       4      456
8       5      568
9       5      459
10      6      235
11      7      654
12      7      547
13      8      126
14      9      154
15      9      187
16     10      145
> valueagg<-aggregate(value$purvalue, by=list(value$custid), FUN = mean)

> valueagg

   Group.1        x
1        1 139.5000
2        2 136.0000
3        3 266.3333
4        4 456.0000
5        5 513.5000
6        6 235.0000
7        7 600.5000
8        8 126.0000
9        9 170.5000
10      10 145.0000


> valueagg<-aggregate(value$purvalue, by=list(value$custid), FUN = sum)

> valueagg

   Group.1    x
1        1  279
2        2  136
3        3  799
4        4  456
5        5 1027
6        6  235
7        7 1201
8        8  126
9        9  341
10      10  145

> valueagg<-aggregate(value$purvalue, by=list(value$custid), FUN = sd)

> valueagg

   Group.1         x
1        1  20.50610
2        2        NA
3        3 122.40231
4        4        NA
5        5  77.07464
6        6        NA
7        7  75.66043
8        8        NA
9        9  23.33452
10      10        NA

> valueagg<-aggregate(value$purvalue, by=list(value$custid), FUN = min)

> valueagg

   Group.1   x
1        1 125
2        2 136
3        3 156
4        4 456
5        5 459
6        6 235
7        7 547
8        8 126
9        9 154
10      10 145
Interpretation:


We learnt the R code for aggregating a dataset.
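
aggregate() also accepts a formula, which keeps the original column names instead of Group.1 and x. The sketch below rebuilds the same purchase data and is offered only as an alternative way of writing the call; the object names by_mean and by_n are illustrative. It also shows why the standard deviation is NA for some customers: sd() of a single value is undefined.

```r
value <- data.frame(
  custid   = c(1, 1, 2, 3, 3, 3, 4, 5, 5, 6, 7, 7, 8, 9, 9, 10),
  purvalue = c(125, 154, 136, 245, 156, 398, 456, 568, 459, 235,
               654, 547, 126, 154, 187, 145)
)

# Formula interface: "purvalue grouped by custid"
by_mean <- aggregate(purvalue ~ custid, data = value, FUN = mean)

# Number of purchases per customer; customers with one purchase have sd = NA
by_n <- aggregate(purvalue ~ custid, data = value, FUN = length)

print(by_mean)
print(by_n)
print(sd(136))  # a single observation has no sample standard deviation
```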


MERGING  DATASET 


Exercise:  15 
Date:  4.12.2019 
Aim: 

To  know  how  to  merge  dataset  in  R 

R  Code  and  Output: 

> # - merging dataset - #

> customer<- data.frame(custid=c(1,2,3,4,5,6,7,8,9,10),
+ gender=c("M","F","M","F","F","M","F","M","F","M"),
+ age=c(25,35,65,45,28,61,49,54,36,45))

> product<- data.frame(custid=c(1,2,3,4,5,7,8,9,12,13),
+ productcode=c("A1","A2","A3","A4","B1","B2","B3","C1","C2","C3"))

> customer

   custid gender age
1       1      M  25
2       2      F  35
3       3      M  65
4       4      F  45
5       5      F  28
6       6      M  61
7       7      F  49
8       8      M  54
9       9      F  36
10     10      M  45

> product

   custid productcode
1       1          A1
2       2          A2
3       3          A3
4       4          A4
5       5          B1
6       7          B2
7       8          B3
8       9          C1
9      12          C2
10     13          C3
> # - FULL JOIN - #

> mergedata<- merge(customer, product, by.x="custid", by.y="custid", all=T)

> mergedata

   custid gender age productcode
1       1      M  25          A1
2       2      F  35          A2
3       3      M  65          A3
4       4      F  45          A4
5       5      F  28          B1
6       6      M  61        <NA>
7       7      F  49          B2
8       8      M  54          B3
9       9      F  36          C1
10     10      M  45        <NA>
11     12   <NA>  NA          C2
12     13   <NA>  NA          C3

> # - INNER JOIN - #

> mergedata1<-merge(customer, product, by.x="custid", by.y="custid", all=F)

> mergedata1

  custid gender age productcode
1      1      M  25          A1
2      2      F  35          A2
3      3      M  65          A3
4      4      F  45          A4
5      5      F  28          B1
6      7      F  49          B2
7      8      M  54          B3
8      9      F  36          C1

> # - RIGHT OUTER JOIN - #

> mergedata2<-merge(customer, product, by.x="custid", by.y="custid", all.y=T)

> mergedata2

   custid gender age productcode
1       1      M  25          A1
2       2      F  35          A2
3       3      M  65          A3
4       4      F  45          A4
5       5      F  28          B1
6       7      F  49          B2
7       8      M  54          B3
8       9      F  36          C1
9      12   <NA>  NA          C2
10     13   <NA>  NA          C3
> # - LEFT OUTER JOIN - #

> mergedata3<-merge(customer, product, by.x="custid", by.y="custid", all.x=T)

> mergedata3

   custid gender age productcode
1       1      M  25          A1
2       2      F  35          A2
3       3      M  65          A3
4       4      F  45          A4
5       5      F  28          B1
6       6      M  61        <NA>
7       7      F  49          B2
8       8      M  54          B3
9       9      F  36          C1
10     10      M  45        <NA>


Interpretation: 

We  learnt  the  R  coding  to  merge  the  dataset  in  R. 
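
The four joins differ only in which unmatched rows are kept, so comparing row counts is a quick sanity check. The sketch below rebuilds the two data frames from this exercise; join_rows and no_product are illustrative names, and by = "custid" is used as shorthand for giving the same key in by.x and by.y.

```r
customer <- data.frame(
  custid = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  gender = c("M", "F", "M", "F", "F", "M", "F", "M", "F", "M"),
  age    = c(25, 35, 65, 45, 28, 61, 49, 54, 36, 45)
)
product <- data.frame(
  custid      = c(1, 2, 3, 4, 5, 7, 8, 9, 12, 13),
  productcode = c("A1", "A2", "A3", "A4", "B1", "B2", "B3", "C1", "C2", "C3")
)

# Row counts of each join type
join_rows <- c(
  full  = nrow(merge(customer, product, by = "custid", all = TRUE)),
  inner = nrow(merge(customer, product, by = "custid")),
  right = nrow(merge(customer, product, by = "custid", all.y = TRUE)),
  left  = nrow(merge(customer, product, by = "custid", all.x = TRUE))
)
print(join_rows)

# Customers with no product record (the rows that show <NA> in the left join)
no_product <- setdiff(customer$custid, product$custid)
print(no_product)
```

The full join has 12 rows because 8 customer ids match, 2 appear only in customer (6 and 10) and 2 appear only in product (12 and 13).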
STACKING DATASET


Exercise:  16 
Date:  5.12.2019 
Aim: 


To  know  how  to  stack  dataset  in  R 

R  Code  and  Output: 

> # - row bind stacking - #

> id<-seq(1,8,1)

> income<-c(18000,15000,21000,16000,17000,16500,14000,13500)

> gender<-c("F","M","M","F","M","F","M","F")

> empdetail<-data.frame(id, income, gender)

> id<-seq(9,16,1)

> income<-c(16000,14000,11000,26000,27000,36500,24000,23500)

> gender<-c("F","M","M","F","M","F","M","F")

> empdetail2<-data.frame(id, income, gender)

> empdata<-rbind(empdetail, empdetail2)

> empdata

   id income gender
1   1  18000      F
2   2  15000      M
3   3  21000      M
4   4  16000      F
5   5  17000      M
6   6  16500      F
7   7  14000      M
8   8  13500      F
9   9  16000      F
10 10  14000      M
11 11  11000      M
12 12  26000      F
13 13  27000      M
14 14  36500      F
15 15  24000      M
16 16  23500      F

Interpretation: 

We  learnt  the  R  coding  to  stack  dataset  in  R. 
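
For data frames, rbind() matches columns by name rather than by position, so both pieces need the same set of column names but not the same column order. A minimal sketch with two small made-up data frames (first_half and second_half are hypothetical, using the first four employees from above):

```r
first_half  <- data.frame(id = 1:2, income = c(18000, 15000), gender = c("F", "M"))

# Same columns, deliberately in a different order
second_half <- data.frame(gender = c("M", "F"), id = 3:4, income = c(21000, 16000))

# Check that the column sets agree before stacking
stopifnot(setequal(names(first_half), names(second_half)))

stacked <- rbind(first_half, second_half)
print(stacked)
```

The result takes its column order from the first data frame. If the column names differ, rbind() stops with an error instead of silently misaligning values.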
SIMPLE BARCHART


Exercise:  17 
Date:  6.12.2019 
Aim: 


To  know  how  to  visualize  dataset  in  R. 

R  Code  and  Output: 

> id<-seq(1,3,1)

> Improvement<-c("None","Some","Marked")

> Frequency<-c(32, [value illegible in source], 23)

> # - SIMPLE BAR CHART USING BARPLOT - #

> barplot(Frequency, width=1, main="Simple Bar Chart", sub="Treatment details for Arthritis",
+ names=Improvement, xlab="Improvement", ylab="Frequency",
+ col=c("red","pink","green"), border=T)


[Figure: Simple Bar Chart; bars: None, Some, Marked; x-axis: Improvement; subtitle: Treatment details for Arthritis]
> # - HORIZONTAL BAR CHART USING BARPLOT - #

> barplot(Frequency, width=1, main="Simple Bar Chart", sub="Treatment details for Arthritis",
+ names=Improvement, xlab="Frequency", ylab="Improvement",
+ col=c("red","pink","green"), border=T, horiz=T)


[Figure: Simple Bar Chart, horizontal bars]


Interpretation: 

We learnt the R code to draw simple vertical and horizontal bar charts using the barplot function.
LINE CHART

Exercise:  18 
Date:  9.12.2019 
Aim: 

To  know  how  to  create  line  chart  in  R. 

R  Code  and  Output: 


> # - CREATE THE DATA FOR THE CHART - #

> v<-c(7,12,28,3,41)

> # - GIVE THE CHART FILE NAME - #

> png(file="line_chart.jpg")

> # - PLOT THE CHART - #

> plot(v, type="o", col="red", xlab="month", ylab="rainfall", main="Rainfall Line Chart")

> # - SAVE THE FILE - #

> dev.off()
null device
          1


[Figure: Rainfall Line Chart]


AJEETH.H  |  17-UST-046  |  PG  &  RESEARCH  DEPARTMENT  OF  STATISTICS,  LOYOLA  COLLEGE,  CHENNAI-34 




> # - MULTIPLE LINE IN A CHART - #

> # - CREATE THE DATA FOR THE CHART - #

> v<-c(7,12,28,3,41)

> t<-c(14,7,6,19,3)

> # - GIVE THE CHART FILE A NAME - #

> png(file="multiple_line_chart.jpg")

> # - PLOT THE CHART - #

> plot(v, type="o", col="red", xlab="month", ylab="rainfall", main="Rainfall Chart")

> lines(t, type="o", col="green")

> # - SAVE THE FILE - #

> dev.off()
null device
          1


[Figure: Rainfall Chart, two lines]


Interpretation: 

We  learnt  the  R  coding  to  visualize  Line  chart  in  R. 
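
A graphics device such as png() decides the file format; the file extension does not, so a file written by png() contains PNG data even if it is named .jpg. Matching the device to the extension (png() with .png, jpeg() with .jpg) avoids confusion. The sketch below repeats the two-line chart and, as an assumption made only for portability, writes to a temporary PDF because the pdf() device does not depend on how R was built. The out path, the ylim setting and the legend are additions for illustration.

```r
v <- c(7, 12, 28, 3, 41)
t <- c(14, 7, 6, 19, 3)

out <- tempfile(fileext = ".pdf")  # hypothetical output path for this sketch
pdf(file = out)

# ylim covers both series so the second line can never be clipped
plot(v, type = "o", col = "red", xlab = "month", ylab = "rainfall",
     main = "Rainfall Chart", ylim = range(c(v, t)))
lines(t, type = "o", col = "green")
legend("topleft", legend = c("v", "t"), col = c("red", "green"), lty = 1, pch = 1)

invisible(dev.off())  # closing the device is what actually writes the file
print(file.exists(out))
```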




PIE  CHART 


Exercise:  19 
Date:  9.12.2019 
Aim: 


To  know  how  to  visualize  Pie  charts  in  R. 

R  Code  and  Output: 

> # - PIE CHART - #

> slices<-c(10,12,4,16,3,37)

> lbls<-c("US","UK","AUSTRALIA","CHINA","FRANCE","INDIA")

> pct<-round(slices/sum(slices)*100)

> lbls2<-paste(lbls," ",pct,"%",sep="")

> pie(slices, labels=lbls2, col=rainbow(length(lbls2)), main="Business Volume")


[Figure: pie chart titled Business Volume]


Interpretation: 

We  learnt  the  R  coding  to  visualize  Pie  Chart  in  R. 
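
The percentage labels are built in two steps: each slice is divided by the total and rounded, and the result is pasted onto the country name. The sketch below walks through those steps with made-up counts (the slices and lbls values here are hypothetical, chosen so the arithmetic is easy to follow) and uses paste0(), which is equivalent to paste(..., sep="").

```r
slices <- c(10, 12, 4, 16, 8)                             # hypothetical counts, total 50
lbls   <- c("US", "UK", "Australia", "Germany", "France") # hypothetical labels

pct   <- round(slices / sum(slices) * 100)  # 10/50 -> 20, 12/50 -> 24, ...
lbls2 <- paste0(lbls, " ", pct, "%")

print(lbls2)
print(sum(pct))

pie(slices, labels = lbls2, col = rainbow(length(lbls2)), main = "Business Volume")
```

With other data the rounded percentages need not add up to exactly 100, so it is worth checking sum(pct) before quoting the labels in a report.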
GROUPED BAR CHART


Exercise:  20 
Date:  11.12.2019 
Aim: 

To know how to visualize a dataset as a grouped (multiple) bar diagram using the barplot function in R.

R  Code  and  Output: 

> # - GROUPED BAR CHART - #

> counts<-table(mtcars$vs, mtcars$gear)

> barplot(counts, main="Car Distribution by Gear and VS", xlab="Number of Gear", ylab="VS",
+ col=c("#FF4242","#33CC66"), legend=rownames(counts), beside=T)


[Figure: grouped bar chart, Car Distribution by Gear and VS; x-axis: Number of Gear]


Interpretation: 

We learnt the R code to visualize the dataset as a grouped (multiple) bar diagram using the barplot function.
STACKED BAR CHART


Exercise:  21 
Date:  11.12.2019 
Aim: 

To know how to visualize a dataset as a stacked bar chart using the barplot function in R.

R  Code  and  Output: 

> # - STACKED BAR CHART - #

> counts<-table(mtcars$vs, mtcars$gear)

> barplot(counts, main="Car Distribution by Gear and VS", xlab="Number of Gear", ylab="VS",
+ col=c("blue","pink"), legend=rownames(counts))


[Figure: stacked bar chart, Car Distribution by Gear and VS; x-axis: Number of Gear (3, 4, 5)]


Interpretation: 

We  learnt  the  R  coding  to  visualize  (Stacked  Bar  Diagram)  the  dataset  using  bar  plot  function. 
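
Both the grouped and the stacked chart are drawn from the same two-way table; beside=T only changes whether the two vs counts for a gear are placed side by side or on top of each other. The sketch below prints that table and the totals that become the stacked bar heights. The dimension names vs and gear and the object names stack_heights and share are added here for readability.

```r
counts <- table(vs = mtcars$vs, gear = mtcars$gear)
print(counts)

# Height of each stacked bar = column total
stack_heights <- colSums(counts)
print(stack_heights)

# Share of each vs level within a gear, useful for a 100% stacked chart
share <- prop.table(counts, margin = 2)
print(round(share, 2))
```

Passing share instead of counts to barplot() would give bars of equal height, which makes the proportions easier to compare across gears.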
HISTOGRAM

Exercise:  22 
Date:  11.12.2019 
Aim: 

To  know  how  to  visualize  Histogram  in  R. 

R  Code  and  Output: 


> # - HISTOGRAM - #

> hist(mtcars$mpg, breaks=12, col="light blue")


[Figure: Histogram of mtcars$mpg; x-axis: mtcars$mpg (10 to 30)]
> AirPassengers

     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1949 112 118 132 129 121 135 148 148 136 119 104 118
1950 115 126 141 135 125 149 170 170 158 133 114 140
1951 145 150 178 163 172 178 199 199 184 162 146 166
1952 171 180 193 181 183 218 230 242 209 191 172 194
1953 196 196 236 235 229 243 264 272 237 211 180 201
1954 204 188 235 227 234 264 302 293 259 229 203 229
1955 242 233 267 269 270 315 364 347 312 274 237 278
1956 284 277 317 313 318 374 413 405 355 306 271 306
1957 315 301 356 348 355 422 465 467 404 347 305 336
1958 340 318 362 348 363 435 491 505 404 359 310 337
1959 360 342 406 396 420 472 548 559 463 407 362 405
1960 417 391 419 461 472 535 622 606 508 461 390 432

> hist(AirPassengers, main="HISTOGRAM OF AIR PASSENGERS", xlab="Passengers", border="red", col="pink", breaks=5, prob=TRUE)


[Figure: HISTOGRAM OF AIR PASSENGERS; x-axis: Passengers (100 to 700); y-axis: Density]
> # - HISTOGRAM WITH NORMAL CURVE - #

> x=c(4.14,4.14,4.16,4.15,4.19,4.13,4.16,4.17)

> hist(x)

> hist(x, col="green", xlim=c(4.10,4.22), ylim=c(0,20), probability=TRUE)

> s=sd(x)

> m=mean(x)

> curve(dnorm(x, mean=m, sd=s), add=TRUE)


[Figure: Histogram of x with fitted normal curve; x-axis: x; y-axis: Density]

Interpretation: 

We  learnt  the  R  coding  to  visualize  Histogram  in  R. 
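
The normal curve can only be overlaid because probability=TRUE switches the vertical axis from counts to densities, the same scale that dnorm() returns. On that scale the bar areas add up to 1. A minimal check on mtcars$mpg, using plot=FALSE so that only the computed histogram object is returned (h and area are illustrative names):

```r
# Compute the histogram without drawing it
h <- hist(mtcars$mpg, breaks = 12, plot = FALSE)

# Area of each bar = density * bin width; the areas sum to 1
area <- sum(h$density * diff(h$breaks))
print(area)

# The raw counts are still available and sum to the sample size
print(sum(h$counts))
```

Without probability=TRUE the bars are drawn on the count scale and a density curve added with curve() would lie almost flat along the bottom of the plot.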
BOXPLOT


Exercise:  23 
Date:  12.12.2019 
Aim: 

To  know  how  to  visualize  Box  plot  in  R. 

R  Code  and  Output: 

> boxplot(mpg~cyl, data=mtcars, main="Car Mileage Data", xlab="Number of Cylinders", ylab="Miles per Gallon")


[Figure: box plot, Car Mileage Data; x-axis: Number of Cylinders (4, 6, 8); y-axis: Miles per Gallon]


Interpretation: 

We  learnt  the  R  coding  to  visualize  Box  plot  in  R. 
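
The numbers behind the boxes can be read off without looking at the picture. The sketch below computes the group medians directly and compares them with the middle row of the stats matrix that boxplot() returns when plot=FALSE; med and bp are illustrative names.

```r
# Median mpg for each cylinder group
med <- tapply(mtcars$mpg, mtcars$cyl, median)
print(med)

# boxplot() returns its five-number summaries; row 3 of $stats holds the medians
bp <- boxplot(mpg ~ cyl, data = mtcars, plot = FALSE)
print(bp$names)
print(bp$stats)
```

The rows of bp$stats are, in order, the lower whisker, lower hinge, median, upper hinge and upper whisker for each group, so the chart can be described numerically in the interpretation.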




SCATTER  PLOT 


Exercise:  24 
Date:  12.12.2019 
Aim: 


To  know  how  to  visualize  Scatter  Plot. 

R  Code  and  Output: 

> # - SCATTER PLOT - #

> x<-c(2.2,3,3.8,4.5,7,8.5,6.7,5.5)

> y<-c(4,5.5,4.5,9,11,15.2,13.3,10.5)

> #Plot points

> plot(x, y)

> #Changing plotting symbol

> #Use solid circle

> plot(x, y, pch=19)


Interpretation: 

We learnt the R code to visualize a scatter plot in R.
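
A scatter plot is usually read together with the correlation between the two variables and a fitted straight line. The sketch below adds both to the same x and y values; the use of cor(), lm() and abline() is an extension beyond the original exercise, and r and fit are illustrative names.

```r
x <- c(2.2, 3, 3.8, 4.5, 7, 8.5, 6.7, 5.5)
y <- c(4, 5.5, 4.5, 9, 11, 15.2, 13.3, 10.5)

r   <- cor(x, y)   # Pearson correlation coefficient
fit <- lm(y ~ x)   # least-squares line y = a + b * x

plot(x, y, pch = 19, main = "Scatter plot with fitted line")
abline(fit, col = "red")  # draw the fitted line over the points

print(r)
print(coef(fit))
```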
ONE SAMPLE t TEST


Exercise:  25 
Date:  31.01.2020 

The lifetime of electric bulbs for a random sample of 10 from a large consignment gave the following data:

Item                1    2    3    4    5    6    7    8    9   10
Life in '000 hrs  4.2  4.6  3.9  4.1  5.2  3.8  3.9  4.3  4.4  5.6

Can we accept the hypothesis that the average life of bulbs is 4000 hours?

Null Hypothesis:

H0: The sample has come from a population with a mean lifetime of 4000 hours.
(Or) The average lifetime of bulbs is 4000 hours.

Alternative Hypothesis:

H1: The sample has not come from a population with a mean lifetime of 4000 hours.
(Or) The average lifetime of bulbs is not 4000 hours.


R  Code  and  Output: 

> life<-c(4.2,4.6,3.9,4.1,5.2,3.8,3.9,4.3,4.4,5.6)

> t.test(life, mu=4)

        One Sample t-test

data:  life
t = 2.1483, df = 9, p-value = 0.0602
alternative hypothesis: true mean is not equal to 4
95 percent confidence interval:
 3.978809 4.821191
sample estimates:
mean of x
      4.4


Interpretation: 

Since the p-value is 0.060, which is greater than 0.05, we do not reject the null hypothesis. Hence the data are consistent with an average bulb lifetime of 4000 hours.
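
The t statistic reported by t.test() is the difference between the sample mean and the hypothesised mean, divided by the standard error sd/sqrt(n). The sketch below recomputes it by hand for the bulb data so that the output above can be checked line by line; mu0, t_manual and p_manual are illustrative names.

```r
life <- c(4.2, 4.6, 3.9, 4.1, 5.2, 3.8, 3.9, 4.3, 4.4, 5.6)
mu0  <- 4                      # hypothesised mean, in thousand hours
n    <- length(life)

# t = (sample mean - mu0) / (sample sd / sqrt(n))
t_manual <- (mean(life) - mu0) / (sd(life) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

res <- t.test(life, mu = mu0)
print(c(manual = t_manual, t.test = unname(res$statistic)))
print(c(manual = p_manual, t.test = res$p.value))
```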
ONE SAMPLE t TEST

Exercise:  25 
Date:  1.02.2020 

The average breaking strength of steel rods is specified to be 17.5. To test this, a sample of 14 rods was tested and the following results were obtained.

15 18 16 21 17 20 19 17 18 17 15 17 20 19

Test at 5% level of significance.

Null Hypothesis:

H0: The sample has come from a population with a mean breaking strength of 17.5.

(Or) The average breaking strength of steel rods is 17.5.

Alternative Hypothesis:

H1: The sample has not come from a population with a mean breaking strength of 17.5.

(Or) The average breaking strength of steel rods is not 17.5.
R Code & output:

> breakingstrength<-read.table("U:\\ugst46\\2.csv", header = T, sep = ",")

> breakingstrength

   breaking.strength
1                 15
2                 18
3                 16
4                 21
5                 17
6                 20
7                 19
8                 17
9                 18
10                17
11                15
12                17
13                20
14                19

> t.test(breakingstrength, mu=17.5)

        One Sample t-test

data:  breakingstrength
t = 0.57874, df = 13, p-value = 0.5727
alternative hypothesis: true mean is not equal to 17.5
95 percent confidence interval:
 16.71918 18.85225
sample estimates:
mean of x
 17.78571

Interpretation: 

Since the p-value is 0.5727, which is greater than 0.05, we do not reject the null hypothesis. Hence the data are consistent with an average breaking strength of 17.5.
T TEST FOR DIFFERENCE OF MEANS (INDEPENDENT SAMPLES)

Exercise:  26 
Date:  3.02.2020 

To compare the prices of a certain commodity in two towns, nine shops were selected at random in each town. The following figures give the prices found:

Town A  61  56  63  56  63  59  56  44  61
Town B  55  47  59  51  61  57  54  64  58

Test whether the average price can be said to be the same in the two towns.

Null Hypothesis:

H0: There is no significant difference between the average price of Town A and Town B.

Alternative Hypothesis:

H1: There is a significant difference between the average price of Town A and Town B.

R  Code  &  output: 

> townA<-c(61,56,63,56,63,59,56,44,61)

> townB<-c(55,47,59,51,61,57,54,64,58)

> t.test(townA, townB, var.equal=TRUE)

        Two Sample t-test

data:  townA and townB
t = 0.55394, df = 16, p-value = 0.5873
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -4.083341  6.972230
sample estimates:
mean of x mean of y
 57.66667  56.22222

Interpretation: 

Since the p-value is 0.5873, which is greater than 0.05, we do not reject the null hypothesis. Hence there is no significant difference between the average price of Town A and Town B.
T TEST FOR COMPARING 2 MEANS WITHOUT ASSUMING EQUAL VARIANCE


Null  Hypothesis: 

H0:  The  true  difference  in  means  is  equal  to  zero. 

Alternative  Hypothesis: 

H1: The true difference in means is not equal to zero.

R  Code  &  output: 

> t.test(townA, townB)

        Welch Two Sample t-test

data:  townA and townB
t = 0.55394, df = 15.744, p-value = 0.5874
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -4.090660  6.979549
sample estimates:
mean of x mean of y
 57.66667  56.22222


Interpretation: 

Since the p-value is 0.5874, which is greater than 0.05, we do not reject the null hypothesis. Hence there is no significant difference between the average price of Town A and Town B.
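
The fractional degrees of freedom in the Welch test come from the Welch-Satterthwaite approximation, which weights each sample's variance by its size. Because both towns have nine shops, the t statistic is the same as in the pooled test and only the degrees of freedom change. The sketch below recomputes the degrees of freedom by hand; vA, vB and df_welch are illustrative names.

```r
townA <- c(61, 56, 63, 56, 63, 59, 56, 44, 61)
townB <- c(55, 47, 59, 51, 61, 57, 54, 64, 58)

nA <- length(townA)
nB <- length(townB)

# Variance of each sample mean
vA <- var(townA) / nA
vB <- var(townB) / nB

# Welch-Satterthwaite degrees of freedom
df_welch <- (vA + vB)^2 / (vA^2 / (nA - 1) + vB^2 / (nB - 1))

# Welch t statistic
t_welch <- (mean(townA) - mean(townB)) / sqrt(vA + vB)

res <- t.test(townA, townB)  # var.equal = FALSE is the default
print(c(manual = df_welch, t.test = unname(res$parameter)))
print(c(manual = t_welch, t.test = unname(res$statistic)))
```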
PAIRED T-TEST (RELATED SAMPLES)

Exercise:  27 
Date:  5.02.2020 

To verify whether a course in accounting improved performance, a similar test was given to 12 participants both before and after the course.

The original marks, recorded in alphabetical order of the participants, were

44 40 61 52 32 44 70 41 67 72 53 and 72.

After the course, the marks in the same order were

53 38 69 57 46 39 73 48 73 74 60 and 78.

Was the course useful?

Null Hypothesis:

H0: There is no significant difference between the average marks before and after the course.
(Or) The course was not useful.

Alternative Hypothesis:

H1: There is a significant difference between the average marks before and after the course.

(Or) The course was useful (Or) H1: μ1 < μ2.
R Code & output:


> before<-c(44,40,61,52,32,44,70,41,67,72,53,72)

> after<-c(53,38,69,57,46,39,73,48,73,74,60,78)

> t.test(before, after, paired=T, alt="less")

        Paired t-test

data:  before and after
t = -3.4454, df = 11, p-value = 0.002736
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
      -Inf -2.393763
sample estimates:
mean of the differences
                     -5


Interpretation: 

Since the p-value is 0.002, which is less than 0.05, we reject the null hypothesis. Hence there is a significant difference between the average marks before and after the course, i.e. the course was useful.
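
A paired t-test is a one-sample t-test applied to the within-participant differences, which is why the output reports a single "mean of the differences". The sketch below runs both forms on the same marks to show that they agree; the object names paired and one_sample are illustrative.

```r
before <- c(44, 40, 61, 52, 32, 44, 70, 41, 67, 72, 53, 72)
after  <- c(53, 38, 69, 57, 46, 39, 73, 48, 73, 74, 60, 78)

# Paired test: H1 says the "before" mean is less than the "after" mean
paired <- t.test(before, after, paired = TRUE, alternative = "less")

# Same test written as a one-sample test on the differences
d <- before - after
one_sample <- t.test(d, mu = 0, alternative = "less")

print(mean(d))
print(c(paired = unname(paired$statistic), one_sample = unname(one_sample$statistic)))
print(c(paired = paired$p.value, one_sample = one_sample$p.value))
```

The pairing matters: treating the two sets of marks as independent samples would ignore that each participant acts as their own control.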
ONE WAY ANOVA


Exercise:  28 
Date:  5.02.2020 


The following table shows the lives (in hours) of four brands of electric bulbs. Test whether the mean lifetime of bulbs is the same across the different brands using one way ANOVA.


Brand     Life of bulbs in Hours
Philips   1600  1610  1650  1680  1700  1720  1800
LG        1580  1640  1640  1700  1750
Surya     1460  1550  1600  1620  1640  1660  1740  1820
Other     1510  1520  1530  1570  1600  1680

Null Hypothesis:

H0: There is no significant difference in the average life of bulbs across brands.


(Or) H0: μ1 = μ2 = μ3 = μ4.


Alternative Hypothesis:

H1: There is a significant difference in the average life of bulbs across brands.
(Or) At least one brand differs.


R  Code  &  output: 

> life<-c(1600,1610,1650,1680,1700,1720,1800,1580,1640,1640,1700,1750,1460,1550,1600,1620,1640,1660,1740,1820,1510,1520,1530,1570,1600,1680)

> brand<-c(rep("Philips",7),rep("LG",5),rep("Surya",8),rep("Other",6))

> time<-data.frame(life, brand)

> results<-aov(life~brand, data=time)

> summary(results)

            Df Sum Sq Mean Sq F value Pr(>F)
brand        3  44361   14787   2.149  0.123
Residuals   22 151351    6880

Interpretation: 

Since the p-value is 0.123, which is greater than 0.05, we do not reject the null hypothesis. Hence there is no significant difference in the average life of bulbs across brands.
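
The F value in the ANOVA table is the ratio of the between-brand mean square to the within-brand (residual) mean square. The sketch below rebuilds both sums of squares from the group means so the table can be verified by hand; the variable names are illustrative, and brand is created as a factor explicitly.

```r
life <- c(1600, 1610, 1650, 1680, 1700, 1720, 1800,        # Philips
          1580, 1640, 1640, 1700, 1750,                    # LG
          1460, 1550, 1600, 1620, 1640, 1660, 1740, 1820,  # Surya
          1510, 1520, 1530, 1570, 1600, 1680)              # Other
brand <- factor(rep(c("Philips", "LG", "Surya", "Other"), times = c(7, 5, 8, 6)))

grand    <- mean(life)
grp_mean <- tapply(life, brand, mean)
grp_n    <- tapply(life, brand, length)

# Between-group and within-group sums of squares
ss_between <- sum(grp_n * (grp_mean - grand)^2)
ss_within  <- sum((life - grp_mean[as.character(brand)])^2)

k <- nlevels(brand)
f_manual <- (ss_between / (k - 1)) / (ss_within / (length(life) - k))

tab <- summary(aov(life ~ brand))[[1]]
print(c(manual = f_manual, aov = tab[["F value"]][1]))
```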
TWO WAY ANOVA


Exercise:  29 
Date:  6.02.2020 


Three different methods of analysis M1, M2, M3 are used and the parts per million defective obtained for five different analysts are shown below. Perform a two way analysis of variance and test the significant difference in parts per million defective between the three different methods and five different analysts.


          Methods
Analyst   M1    M2    M3
1         7.5   7.0   7.1
2         7.4   7.2   6.7
3         7.3   7.0   6.9
4         7.6   7.2   6.8
5         7.4   7.1   6.9

Null Hypothesis:

H0: There is no significant difference in parts per million defective between the three methods.
H0: There is no significant difference in parts per million defective between the five analysts.

Alternative Hypothesis:

H1: There is a significant difference in parts per million defective between the three methods.
H1: There is a significant difference in parts per million defective between the five analysts.
R Code & output:

> defective<-c(7.5,7,7.1,7.4,7.2,6.7,7.3,7,6.9,7.6,7.2,6.8,7.4,7.1,6.9)

> Analyst<-c(rep("Analyst 1",3),rep("Analyst 2",3),rep("Analyst 3",3),rep("Analyst 4",3),rep("Analyst 5",3))

> Method<-rep(c("M1","M2","M3"),5)

> ppm<-data.frame(Method, Analyst, defective)

> twoway<-aov(defective~Analyst+Method, data=ppm)

> summary(twoway)

            Df Sum Sq Mean Sq F value   Pr(>F)
Analyst      4 0.0427  0.0107   0.621 0.660140
Method       2 0.7960  0.3980  23.184 0.000469 ***
Residuals    8 0.1373  0.0172
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Interpretation: 

>  Since the p-value for analyst is 0.66, which is greater than 0.05, we do not reject the null hypothesis. There is no significant difference in parts per million defective between the five analysts.

>  Since the p-value for method is 0.0005, which is less than 0.05, we reject the null hypothesis. Hence there is a significant difference in parts per million defective between the three methods.
LSD (Latin Square Design)


Exercise:  30 
Date:  7.02.2020 

To analyze the productivity of 5 kinds of fertilizer, 5 kinds of soil and 5 kinds of seed (A to E), the data are organized in a Latin Square design as follows.


               Soil A  Soil B  Soil C  Soil D  Soil E
Fertilizer 1   A42     C47     B55     D51     E44
Fertilizer 2   E45     B54     C52     A44     D50
Fertilizer 3   C41     A46     D57     E47     B48
Fertilizer 4   B56     D52     E49     C50     A43
Fertilizer 5   D47     E49     A45     B54     C46

Null Hypothesis:

H01: There is no significant difference in productivity among different fertilizers.
H02: There is no significant difference in productivity among different soils.

H03: There is no significant difference in productivity among different seeds.

Alternative Hypothesis:

H11: There is a significant difference in productivity among different fertilizers.
H12: There is a significant difference in productivity among different soils.

H13: There is a significant difference in productivity among different seeds.


R  Code  &  output: 

> values<-c(42,47,55,51,44,45,54,52,44,50,41,46,57,47,48,56,52,49,50,43,47,49,45,54,46)

> fertilizer<-c(rep("fertilizer1",5),rep("fertilizer2",5),rep("fertilizer3",5),rep("fertilizer4",5),rep("fertilizer5",5))

> soil<-c(rep("soilA",1),rep("soilB",1),rep("soilC",1),rep("soilD",1),rep("soilE",1))

> seeds<-c("A","C","B","D","E","E","B","C","A","D","C","A","D","E","B","B","D","E","C","A","D","E","A","B","C")

> Data2<-data.frame(fertilizer, soil, seeds, values)

> Threeway<-aov(values~fertilizer+soil+seeds, Data2)

> summary(Threeway)

            Df Sum Sq Mean Sq F value   Pr(>F)
fertilizer   4  17.76    4.44   0.797 0.545839
soil         4 109.36   27.34   4.906 0.014105 *
seeds        4 286.16   71.54  12.836 0.000271 ***
Residuals   12  66.88    5.57
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> TukeyHSD(Threeway)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = values ~ fertilizer + soil + seeds, data = Data2)

$fertilizer
                        diff      lwr     upr     p adj
fertilizer2-fertilizer1  1.2 -3.55914 5.95914 0.9245185
fertilizer3-fertilizer1  0.0 -4.75914 4.75914 1.0000000
fertilizer4-fertilizer1  2.2 -2.55914 6.95914 0.5965545
fertilizer5-fertilizer1  0.4 -4.35914 5.15914 0.9986976
fertilizer3-fertilizer2 -1.2 -5.95914 3.55914 0.9245185
fertilizer4-fertilizer2  1.0 -3.75914 5.75914 0.9593153
fertilizer5-fertilizer2 -0.8 -5.55914 3.95914 0.9816941
fertilizer4-fertilizer3  2.2 -2.55914 6.95914 0.5965545
fertilizer5-fertilizer3  0.4 -4.35914 5.15914 0.9986976
fertilizer5-fertilizer4 -1.8 -6.55914 2.95914 0.7485642

$soil
            diff         lwr        upr     p adj
soilB-soilA  3.4  -1.3591399  8.1591399 0.2175142
soilC-soilA  5.4   0.6408601 10.1591399 0.0239813
soilD-soilA  3.0  -1.7591399  7.7591399 0.3181622
soilE-soilA  0.0  -4.7591399  4.7591399 1.0000000
soilC-soilB  2.0  -2.7591399  6.7591399 0.6738074
soilD-soilB -0.4  -5.1591399  4.3591399 0.9986976
soilE-soilB -3.4  -8.1591399  1.3591399 0.2175142
soilD-soilC -2.4  -7.1591399  2.3591399 0.5200518
soilE-soilC -5.4 -10.1591399 -0.6408601 0.0239813
soilE-soilD -3.0  -7.7591399  1.7591399 0.3181622

$seeds
    diff         lwr        upr     p adj
B-A  9.4   4.6408601 14.1591399 0.0003129
C-A  3.2  -1.5591399  7.9591399 0.2642252
D-A  7.4   2.6408601 12.1591399 0.0025012
E-A  2.8  -1.9591399  7.5591399 0.3792691
C-B -6.2 -10.9591399 -1.4408601 0.0095740
D-B -2.0  -6.7591399  2.7591399 0.6738074
E-B -6.6 -11.3591399 -1.8408601 0.0060822
D-C  4.2  -0.5591399  8.9591399 0.0936276
E-C -0.4  -5.1591399  4.3591399 0.9986976
E-D -4.6  -9.3591399  0.1591399 0.0598894
Interpretation:

>  Since the p-value for fertilizer is 0.54, which is greater than 0.05, we do not reject the null hypothesis. There is no significant difference in productivity among different fertilizers.


>  Since the p-value for soil is 0.014, which is less than 0.05, we reject the null hypothesis. There is a significant difference in productivity among different soils.

The soil pairs C-A and E-C differ significantly (adjusted p = 0.024 for both).


>  Since the p-value for seeds is 0.0003, which is less than 0.05, we reject the null hypothesis. There is a significant difference in productivity among different seeds.

The seed pairs B-A, D-A, C-B and E-B differ significantly.
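
Rather than scanning the Tukey tables by eye, the significant pairs can be filtered on the "p adj" column. The sketch below is a self-contained version using the Latin square data above. It builds the three classification variables with factor() explicitly, because from R 4.0.0 onwards data.frame() no longer converts character columns to factors by default and TukeyHSD() works on factors. The names fit, tk and sig_pairs are illustrative.

```r
values <- c(42, 47, 55, 51, 44,  45, 54, 52, 44, 50,  41, 46, 57, 47, 48,
            56, 52, 49, 50, 43,  47, 49, 45, 54, 46)

# One fertilizer per row of the square, one soil per column
fertilizer <- factor(rep(paste0("fertilizer", 1:5), each = 5))
soil       <- factor(rep(paste0("soil", LETTERS[1:5]), times = 5))
seeds      <- factor(c("A", "C", "B", "D", "E",  "E", "B", "C", "A", "D",
                       "C", "A", "D", "E", "B",  "B", "D", "E", "C", "A",
                       "D", "E", "A", "B", "C"))

fit <- aov(values ~ fertilizer + soil + seeds)
tk  <- TukeyHSD(fit)

# For each factor, keep the pairs whose adjusted p-value is below 0.05
sig_pairs <- lapply(tk, function(m) rownames(m)[m[, "p adj"] < 0.05])
print(sig_pairs)
```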




Mann-Whitney  U  Test 


Exercise:  31 
Date:  10.02.2020 

A large corporate hospital hires most of its doctors from two major universities. Over the last year, the hospital has been testing its newly recruited doctors to determine which university educates better. Based on the following scores, help the human resource department of the hospital decide whether the universities differ in quality.

Use  Mann-Whitney  U  test. 


University A: 99, 83, 89, 64, 98, 85, 61, 79, 91, 87, 88

University B: 96, 90, 97, 94, 86, 95, 68, 78, 93, 56, 76, 84

Null  Hypothesis: 

H0:  There  is  no  significant  difference  between  the  scores  of  University  A  and  University  B. 

Alternative  Hypothesis: 

H1: There is a significant difference between the scores of University A and University B.

R  Code  &  output: 

> A<-c(99,83,89,64,98,85,61,79,91,87,88)
> B<-c(96,90,97,94,86,95,68,78,93,56,76,84)
> wilcox.test(A,B)

        Wilcoxon rank sum test

data:  A and B
W = 64, p-value = 0.9279
alternative hypothesis: true location shift is not equal to 0


Interpretation: 

> Since the p-value is 0.9279, which is greater than 0.05, we do not reject the null hypothesis. There is no significant difference between the scores of University A and University B.
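
The statistic W reported by wilcox.test can be reproduced from the pooled ranks. The sketch below is only an illustration; the names r, n1, W_manual and W_builtin are introduced here and are not part of the exercise.

```r
# Scores as given in the exercise
A <- c(99, 83, 89, 64, 98, 85, 61, 79, 91, 87, 88)
B <- c(96, 90, 97, 94, 86, 95, 68, 78, 93, 56, 76, 84)

# Rank the pooled sample, then sum the ranks belonging to A
r  <- rank(c(A, B))
n1 <- length(A)

# W = (rank sum of A) - n1 * (n1 + 1) / 2
W_manual  <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2
W_builtin <- unname(wilcox.test(A, B)$statistic)

print(c(manual = W_manual, builtin = W_builtin))
```

Both values should agree with W = 64 in the output above.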
WILCOXON SIGNED RANK TEST


Exercise:  32 
Date:  12.02.2020 

Performance  scores  of  16  salesmen  before  and  after  training  are  given  below: 


Scores before training: 85, 76, 64, 59, 72, 68, 43, 54, 57, 61, 71, 82, 39, 51, 54, 57

Scores after training: 82, 79, 68, 52, 75, 69, 40, 53, 50, 67, 71, 83, 54, 59, 51, 58

At  5%  level  of  significance,  test  the  hypothesis,  using  wilcoxon  test,  that  the  training  has  not  caused 
any  change  in  the  performance  score. 

Null  Hypothesis: 

H0:  There  is  no  significant  difference  between  the  scores  of  before  and  after  training. 

Alternative  Hypothesis: 

H1: There is a significant difference between the scores of before and after training.

R  code  and  output: 

> before<-c(85,76,64,59,72,68,43,54,57,61,71,82,39,51,54,57)
> after<-c(82,79,68,52,75,69,40,53,50,67,71,83,54,59,51,58)
> wilcox.test(before,after,paired=T,alt="less")

        Wilcoxon signed rank test with continuity correction

data:  before and after
V = 48.5, p-value = 0.2648
alternative hypothesis: true location shift is less than 0


Interpretation: 

Since the p-value is 0.2648, which is greater than 0.05, we do not reject the null hypothesis. Hence there is no significant difference between the scores of before and after training.
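
The signed rank statistic V can be worked out by hand from the paired differences. This is a minimal sketch using the scores from the table; the names d, r, V_manual and res are illustrative only.

```r
before <- c(85, 76, 64, 59, 72, 68, 43, 54, 57, 61, 71, 82, 39, 51, 54, 57)
after  <- c(82, 79, 68, 52, 75, 69, 40, 53, 50, 67, 71, 83, 54, 59, 51, 58)

# Paired differences; zero differences are dropped before ranking
d <- before - after
d <- d[d != 0]

# Rank the absolute differences (ties get average ranks)
r <- rank(abs(d))

# V is the sum of the ranks attached to positive differences
V_manual <- sum(r[d > 0])

# Ties make the exact p-value unavailable, so R warns and uses the
# normal approximation; the warning is suppressed here
res <- suppressWarnings(
  wilcox.test(before, after, paired = TRUE, alternative = "less")
)

print(c(manual = V_manual, builtin = unname(res$statistic)))
print(res$p.value)
```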
KRUSKAL WALLIS TEST


Exercise:  33 
Date:  12.02.2020 

A departmental store has shops at three different locations in the city. The owner keeps a daily record, for each location, of the number of customers who actually make a purchase. A sample of those data is as follows. Using the Kruskal-Wallis test, can you say at the 5% level of significance that the shops have the same number of customers who buy?


Location A: 99, 64, 101, 85, 79, 88, 97, 95, 90, 100

Location B: 83, 102, 125, 61, 91, 96, 94, 89, 93, 75

Location C: 89, 98, 56, 105, 87, 90, 87, 101, 76, 89

Null  Hypothesis: 

H0: There is no significant difference between the three locations.

Alternative Hypothesis:

H1: There is a significant difference between the three locations.


R  code  and  output: 

> locationA<-c(99,64,101,85,79,88,97,95,90,100)
> locationB<-c(83,102,125,61,91,96,94,89,93,75)
> locationC<-c(89,98,56,105,87,90,87,101,76,89)
> kruskal.test(list(locationA,locationB,locationC))

        Kruskal-Wallis rank sum test

data:  list(locationA, locationB, locationC)
Kruskal-Wallis chi-squared = 0.19643, df = 2, p-value = 0.9065


Interpretation: 

Since  the  p  value  is  0.9065  which  is  greater  than  0.05,  we  accept  the  null  hypothesis.  Hence  there  is 
no  significant  difference  between  the  three  locations. 
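
The p-value in the output is the upper tail of a chi-squared distribution with 2 degrees of freedom, evaluated at the reported statistic. A small sketch of that step, with H, df and p_value as illustrative names:

```r
# Statistic and degrees of freedom copied from the kruskal.test output
H  <- 0.19643
df <- 2

# Upper-tail chi-squared probability
p_value <- pchisq(H, df = df, lower.tail = FALSE)

print(round(p_value, 4))
```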
CORRELATION


Exercise:  34 
Date:  13.02.2020 
AIM: 

To  create  R  coding  to  find  Pearson  correlation  and  their  testing. 

Null  Hypothesis: 

H0: There is no correlation between miles per gallon and horse power.

Alternative Hypothesis:

H1: There is a correlation between miles per gallon and horse power.

R  code  and  output: 

> cor(mtcars$mpg,mtcars$hp,method="pearson")
[1] -0.7761684

> cor.test(mtcars$mpg,mtcars$hp,method="pearson")

        Pearson's product-moment correlation

data:  mtcars$mpg and mtcars$hp
t = -6.7424, df = 30, p-value = 1.788e-07
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.8852686 -0.5860994
sample estimates:
       cor
-0.7761684

Interpretation: 

We learnt the R coding for Pearson correlation.

The Pearson correlation coefficient between miles per gallon and horse power is -0.7761684 (high negative correlation). This shows that a car with high horse power gives fewer miles per gallon. The p-value for testing the correlation (H0: rho = 0) is 0.0000001788, which is less than 0.01. This indicates that there is a highly significant negative correlation between horse power and miles per gallon.
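
The t value in the cor.test output follows from the correlation coefficient through t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below checks this on the built-in mtcars data; t_manual and t_builtin are names chosen for illustration.

```r
# Pearson correlation and sample size from the built-in mtcars data
r <- cor(mtcars$mpg, mtcars$hp)
n <- nrow(mtcars)

# Test statistic with n - 2 degrees of freedom
t_manual  <- r * sqrt(n - 2) / sqrt(1 - r^2)
t_builtin <- unname(cor.test(mtcars$mpg, mtcars$hp)$statistic)

print(c(manual = t_manual, builtin = t_builtin))
```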
SPEARMAN AND KENDALL CORRELATION


R  code  and  output: 


Candidate   Interviewer 1   Interviewer 2
A            1               1
B            2               2
C            3               4
D            4               3
E            5               6
F            6               5
G            7               8
H            8               7
I            9              10
J           10               9
K           11              12
L           12              11

> int1<-c(1,2,3,4,5,6,7,8,9,10,11,12)
> int2<-c(1,2,4,3,6,5,8,7,10,9,12,11)
> cor(int1,int2,method="spearman")
[1] 0.965035

Interpretation: 

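The Spearman rank correlation coefficient between the two interviewers is 0.965, which is close to +1. The value 0.965 is a correlation coefficient and not a p-value, because cor() does not carry out a significance test. There is a very strong positive agreement between the rankings given by interviewer 1 and interviewer 2.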
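
The coefficient can be checked against the rank-difference formula rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), and the Kendall coefficient named in the title is available through the same cor() function. This is a sketch only; rho_manual, tau and test_rho are illustrative names, and the significance test is an addition that the exercise itself does not run.

```r
int1 <- 1:12
int2 <- c(1, 2, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11)

# Spearman coefficient from the rank differences (no ties in the data)
n <- length(int1)
d <- int1 - int2
rho_manual  <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))
rho_builtin <- cor(int1, int2, method = "spearman")

# Kendall's tau on the same rankings
tau <- cor(int1, int2, method = "kendall")

# A p-value needs cor.test(), not cor()
test_rho <- cor.test(int1, int2, method = "spearman")

print(c(rho_manual = rho_manual, rho_builtin = rho_builtin, tau = tau))
print(test_rho$p.value)
```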
FITTING OF SIMPLE LINEAR REGRESSION MODEL


Exercise:  35 
Date:  14.02.2020 
Problem: 

A  sales  Manager  collected  the  following  data  on  annual  sales  and  years  of  experience. 


S.NO   Years of experience   Annual Sales (1000$)
1       1                     80
2       3                     97
3       4                     92
4       4                    102
5       6                    103
6       8                    111
7      10                    119
8      10                    123
9      11                    117
10     13                    136

Aim:  To  fit  a  Linear  Regression  Model  for  the  given  data 

Null  Hypothesis: 

H0:  The  fitted  model  is  not  significant. 

Alternative  Hypothesis: 

H1: The fitted model is significant.

R  Code  and  Output: 


> # - FITTING OF SIMPLE LINEAR REGRESSION MODEL - #
> # - GETTING INPUT DATA - #
> salesperson=c(1:10)
> yearsofexperience=c(1,3,4,4,6,8,10,10,11,13)
> annualsales=c(80,97,92,102,103,111,119,123,117,136)
> # - GETTING DATA OUTPUT - #
> salesdetails=data.frame(salesperson,yearsofexperience,annualsales)
> salesdetails
   salesperson yearsofexperience annualsales
1            1                 1          80
2            2                 3          97
3            3                 4          92
4            4                 4         102
5            5                 6         103
6            6                 8         111
7            7                10         119
8            8                10         123
9            9                11         117
10          10                13         136
SCATTER PLOT:

> plot(yearsofexperience,annualsales,main="Scatter Plot")

[Figure: "Scatter Plot" of annualsales against yearsofexperience]



CORRELATION: 


> cor.test(yearsofexperience,annualsales)

        Pearson's product-moment correlation

data:  yearsofexperience and annualsales
t = 10.34, df = 8, p-value = 6.609e-06
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.8529446 0.9918346
sample estimates:
      cor
0.9645646
> # - REGRESSION FITTING - #
> regmodel=lm(annualsales~yearsofexperience)
> summary(regmodel)

Call:
lm(formula = annualsales ~ yearsofexperience)

Residuals:
   Min     1Q Median     3Q    Max
 -7.00  -3.25  -1.00   3.75   6.00

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)        80.0000     3.0753   26.01 5.12e-09 ***
yearsofexperience   4.0000     0.3868   10.34 6.61e-06 ***

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.61 on 8 degrees of freedom
Multiple R-squared:  0.9304,  Adjusted R-squared:  0.9217
F-statistic: 106.9 on 1 and 8 DF,  p-value: 6.609e-06



Interpretation: 

We  learned  the  R  coding  for  fitting  simple  linear  regression  model. 

1. From the scatter plot, we can conclude that annual sales and years of experience have a linear relationship.

2. The fitted simple linear regression model is Y = 80 + 4X.

3. 93.04% of the variation in Y is explained by the variable X (multiple R-squared = 0.9304; adjusted R-squared = 0.9217).

4. The intercept gives estimated annual sales of $80,000 at zero years of experience.

5. Since the p-value is less than 0.05 (0.000006609 < 0.05), at the 5% level we reject H0 and conclude that the fitted model is significant.
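
The fitted line Y = 80 + 4X can be reproduced from the least-squares formulas b1 = Sxy / Sxx and b0 = mean(y) - b1 * mean(x). The sketch below uses short illustrative names (x, y, Sxx, Sxy, b0, b1) in place of the variables in the exercise.

```r
x <- c(1, 3, 4, 4, 6, 8, 10, 10, 11, 13)              # years of experience
y <- c(80, 97, 92, 102, 103, 111, 119, 123, 117, 136) # annual sales (1000$)

# Corrected sums of squares and cross-products
Sxx <- sum((x - mean(x))^2)
Sxy <- sum((x - mean(x)) * (y - mean(y)))

# Least-squares slope and intercept
b1 <- Sxy / Sxx
b0 <- mean(y) - b1 * mean(x)

print(c(intercept = b0, slope = b1))

# The same estimates from lm(), for comparison
print(coef(lm(y ~ x)))
```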
FITTING OF MULTIPLE LINEAR REGRESSION MODEL


Exercise:  36 
Date:  17.02.2020 


The owner of the Show Time movie theatre would like to estimate weekly gross revenue as a function of advertising expenditures. Historical data for a sample of eight weeks are as follows:


Weekly Gross Revenue ($1000)   TV advertisement ($1000)   Newspaper advertisement ($1000)
96                             5.0                        1.5
90                             2.0                        2.0
95                             4.0                        1.5
92                             2.5                        2.5
95                             3.0                        3.3
94                             3.5                        2.3
94                             2.5                        4.2
94                             3.0                        2.5

Aim: 

To  fit  a  multiple  linear  regression  model  for  the  given  data. 


Null  Hypothesis: 

H0:  The  fitted  model  is  not  significant. 
Alternative  Hypothesis: 

H1: The fitted model is significant.


R  Code  and  Output: 
> y<-c(96,90,95,92,95,94,94,94)
> x1<-c(5,2,4,2.5,3,3.5,2.5,3)
> x2<-c(1.5,2,1.5,2.5,3.3,2.3,4.2,2.5)
> data<-data.frame(y,x1,x2)
> reg=lm(y~x1+x2)
> summary(reg)

Call:
lm(formula = y ~ x1 + x2)

Residuals:
      1       2       3       4       5       6       7       8
-0.6325 -0.4124  0.6577 -0.2080  0.6061 -0.2380 -0.4197  0.6469

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  83.2301     1.5739  52.882 4.57e-08 ***
x1            2.2902     0.3041   7.532 0.000653 ***
x2            1.3010     0.3207   4.057 0.009761 **

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.6426 on 5 degrees of freedom
Multiple R-squared:  0.919,  Adjusted R-squared:  0.8866
F-statistic: 28.38 on 2 and 5 DF,  p-value: 0.001865


Interpretation: 


We learnt the R coding for fitting a multiple linear regression model.

1. The fitted multiple linear regression model is
Y = 83.2301 + 2.2902x1 + 1.3010x2

2. 91.9% of the variation in Y is explained by the variables x1 and x2 (multiple R-squared = 0.919; adjusted R-squared = 0.8866).

3. The intercept gives an estimated weekly gross revenue of $83,230.1 when x1 and x2 are both zero.

4. Since the p-value is less than 0.05 (0.001865 < 0.05), at the 5% level we reject H0 and conclude that the fitted model is significant.
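
The adjusted R-squared in the summary is obtained from the multiple R-squared by penalising for the number of predictors: adjusted = 1 - (1 - R^2)(n - 1)/(n - p - 1). A short sketch with the figures from the output above; r2, n, p and adj_r2 are illustrative names.

```r
r2 <- 0.919   # multiple R-squared from summary(reg)
n  <- 8       # number of weeks observed
p  <- 2       # number of predictors (x1 and x2)

# Adjusted R-squared penalises the fit for the predictors used
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(round(adj_r2, 4))
```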
SIMPLEX METHOD


Exercise:  37 


Date:  19.02.2020 


Solve  the  following  LPP  by  using  Simplex  method. 


Max Z = 22x1 + 30x2 + 25x3

Subject to the constraints

2x1 + 2x2 + 0x3 <= 100

2x1 + x2 + x3 <= 100

x1 + 2x2 + 2x3 <= 100 ; x1, x2, x3 >= 0


Aim:  To  create  R  coding  to  solve  the  LPP  by  simplex  method. 


R  code  and  Output: 


> library(boot)
> # - SIMPLEX METHOD - #
> objective<-c(22,30,25)
> cons1<-c(2,2,0)
> cons2<-c(2,1,1)
> cons3<-c(1,2,2)
> simplex(a=objective,A1=rbind(cons1,cons2,cons3),b1=c(100,100,100),maxi=T)

Linear Programming Results

Call : simplex(a = objective, A1 = rbind(cons1, cons2, cons3), b1 = c(100,
    100, 100), maxi = T)

Maximization Problem with Objective Function Coefficients
x1 x2 x3
22 30 25

Optimal solution has the following values
      x1       x2       x3
33.33333 16.66667 16.66667
The optimal value of the objective function is 1650.


Interpretation: 


We learnt the R coding for solving LPP by simplex method.

The optimal value of the objective function is Max Z = 1650 with the solution

x1 = 33.33, x2 = 16.67, x3 = 16.67
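
A reported optimum can be sanity-checked by substituting it back into the constraints and the objective. The sketch below does this in base R with the exact fractions 100/3, 50/3 and 50/3 that the printed decimals round; A, b, x, lhs, feasible and z are illustrative names.

```r
objective <- c(22, 30, 25)

# Coefficient rows of the three <= constraints and their right-hand sides
A <- rbind(c(2, 2, 0),
           c(2, 1, 1),
           c(1, 2, 2))
b <- c(100, 100, 100)

# Reported solution, written as exact fractions
x <- c(100, 50, 50) / 3

# Left-hand sides; a small tolerance absorbs floating-point error
lhs      <- as.vector(A %*% x)
feasible <- all(lhs <= b + 1e-9) && all(x >= 0)

# Objective value at the solution
z <- sum(objective * x)

print(lhs)
print(feasible)
print(z)
```

All three constraints hold with equality at this point, so none of the resources is left unused.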
Big M - Method


Exercise:  38 
Date:  20.02.2020 

Solve the following LPP by Big M method, Max Z = 5x1 - 6x2 - 7x3
Subject to the constraints x1 + 5x2 - 3x3 >= 15

5x1 - 6x2 + 10x3 >= 0
x1 + x2 + x3 = 5, x1, x2, x3 >= 0

Aim:  To  create  R  coding  to  solve  the  LPP  by  Big  M  method. 

R  code  &  Output: 

> # - BIG M METHOD - #
> library(boot)
> objective<-c(5,-6,-7)
> cons1<-c(1,5,-3)
> cons2<-c(5,-6,10)
> cons3<-c(1,1,1)
> simplex(a=objective,A1=rbind(cons1,cons2),b1=c(15,0),A3=cons3,b3=5,maxi=T)

Linear Programming Results

Call : simplex(a = objective, A1 = rbind(cons1, cons2), b1 = c(15, 0),
    A3 = cons3, b3 = 5, maxi = T)

Maximization Problem with Objective Function Coefficients
x1 x2 x3
 5 -6 -7

Optimal solution has the following values
      x1       x2       x3
2.727273 2.272727 0.000000
The optimal value of the objective function is 0.

Interpretation: 

We learnt the R coding for solving LPP by Big M method.

The optimal value of the objective function is Max Z = 0 with the solution

x1 = 2.73, x2 = 2.27, x3 = 0
TWO PHASE SIMPLEX METHOD

Exercise:  39 
Date:  21.02.2020 

Solve  the  following  LPP  by  two  phase  simplex  method: 

Max Z = 12x1 + 15x2 + 9x3

Subject to the constraints
8x1 + 16x2 + 12x3 <= 250

4x1 + 8x2 + 10x3 >= 80

7x1 + 9x2 + 8x3 = 105 , x1, x2, x3 >= 0

Aim: 

To create R coding to solve the LPP by Two phase simplex method.

R  Code  and  Output: 

> # - TWO PHASE METHOD - #
> library(boot)
> objective<-c(12,15,9)
> cons1<-c(8,16,12)
> cons2<-c(4,8,10)
> cons3<-c(7,9,8)
> simplex(a=objective,A1=cons1,b1=250,A2=cons2,b2=80,A3=cons3,b3=105,maxi=T)

Linear Programming Results

Call : simplex(a = objective, A1 = cons1, b1 = 250, A2 = cons2, b2 = 80,
    A3 = cons3, b3 = 105, maxi = T)

Maximization Problem with Objective Function Coefficients
x1 x2 x3
12 15  9

Optimal solution has the following values
x1 x2 x3
 6  7  0
The optimal value of the objective function is 177.
Interpretation:

We learned R coding for solving LPP by Two phase simplex method.
The optimal value of the objective function is Max Z = 177 with the solution

x1 = 6, x2 = 7, x3 = 0


Question: 

A  company  produces  two  models  of  chair:  4P  and  3P.  The  Model  4P  needs  4  legs,  1  seat  and  back. 
On  the  other  hand,  the  model  3P  needs  3  legs  and  1  seat.  The  company  has  an  initial  stock  of  200 
legs,  500  seats  and  100  backs.  If  the  company  needs  more  legs,  seats  and  backs,  it  can  buy  standard 
wood  blocks,  whose  cost  is  80  Rs  per  block.  The  company  can  produce  10  seats,  20  legs  and  2 
backs  from  a  standard  wood  block.  The  cost  of  producing  the  model  4P  is  30  Rs/chair,  meanwhile 
the  cost  of  producing  the  model  3P  is  40  Rs/chair.  Finally,  the  company  informs  that  the  minimum 
number  of  chairs  to  produce  is  1000  units  per  month.  Define  a  linear  programming  model,  which 
minimizes  the  total  cost  (the  production  costs  of  the  two  chairs,  plus  the  buying  of  new  wood 
blocks). 
Transportation Problem


Exercise:  40 
Date:  24.02.2020 

Solve  the  following  transportation  problem, 


Origins    Destinations           Supply
            1    2    3    4
A          15   12   42   33      23
B          80   42   26   81      44
C          90   40   66   60      33
Demand     23   31   16   30     100

Aim: 

To  create  R  coding  to  solve  the  Transportation  problem. 

R  code  &  Output: 

> # - Transportation problem - #
> library(lpSolve)
> cost<-matrix(c(15,12,42,33,80,42,26,81,90,40,66,60),nrow=3,ncol=4,byrow=T)
> rowsigns<-rep("=",3)
> rowvalues<-c(23,44,33)
> colsigns<-rep("=",4)
> colvalues<-c(23,31,16,30)
> fit<-lp.transport(cost,"min",rowsigns,rowvalues,colsigns,colvalues)
> fit$solution
     [,1] [,2] [,3] [,4]
[1,]   23    0    0    0
[2,]    0   28   16    0
[3,]    0    3    0   30
> fit
Success: the objective function is 3857


Interpretation: 

We learnt the R coding for solving Transportation problem.

The optimal allocations for the given transportation problem are as follows:

Origin   Destination   Number of units allotted
A        1             23
B        2             28
B        3             16
C        2              3
C        4             30

The minimum transportation cost is Rs.3857.
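
The allocation can be verified by multiplying it cell by cell with the cost matrix and by checking the row and column totals against supply and demand. A base R sketch, in which alloc, total_cost, supply_ok and demand_ok are illustrative names:

```r
# Unit costs (rows: origins A, B, C; columns: destinations 1 to 4)
cost <- matrix(c(15, 12, 42, 33,
                 80, 42, 26, 81,
                 90, 40, 66, 60), nrow = 3, byrow = TRUE)

# Allocation reported in fit$solution
alloc <- matrix(c(23,  0,  0,  0,
                   0, 28, 16,  0,
                   0,  3,  0, 30), nrow = 3, byrow = TRUE)

# Total cost is the sum of cost times units over all cells
total_cost <- sum(cost * alloc)

# Every origin ships its full supply and every destination gets its demand
supply_ok <- all(rowSums(alloc) == c(23, 44, 33))
demand_ok <- all(colSums(alloc) == c(23, 31, 16, 30))

print(total_cost)
print(c(supply_ok = supply_ok, demand_ok = demand_ok))
```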
Assignment Problem

Exercise:  41 
Date:  25.02.2020 

Five jobs have to be allotted to five machines such that the total time taken to perform all the jobs is minimum. Make the optimum allotment of the jobs to the machines.


Machine \ Job    I   II  III   IV    V
A               11   17   18   16   20
B                9    7   12    6   15
C               13   16   15   12   16
D               21   24   17   28   26
E               14   10   12   11   15

Aim: 


To  create  R  coding  to  solve  the  assignment  problem. 


R  code  &  Output: 


> # - Assignment problem - #
> library(lpSolve)
> time<-matrix(c(11,17,18,16,20,9,7,12,6,15,13,16,15,12,16,21,24,17,28,26,14,10,12,11,15),nrow=5,ncol=5,byrow=T)
> fit<-lp.assign(time,direction="min")
> fit$solution
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    0
[2,]    0    0    0    1    0
[3,]    0    0    0    0    1
[4,]    0    0    1    0    0
[5,]    0    1    0    0    0
> fit
Success: the objective function is 60


Interpretation: 


We learnt the R coding for solving Assignment problem.

The optimal allocations for the given assignment problem are as follows:

Job    Machine   Time to complete the job
I      A         11
II     E         10
III    D         17
IV     B          6
V      C         16

The minimum time to complete the jobs is 60.
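
The total of 60 can be confirmed by picking out, for each row of the time matrix, the column marked 1 in fit$solution. The sketch below does this with matrix indexing; assigned_col, times and total are illustrative names.

```r
time <- matrix(c(11, 17, 18, 16, 20,
                  9,  7, 12,  6, 15,
                 13, 16, 15, 12, 16,
                 21, 24, 17, 28, 26,
                 14, 10, 12, 11, 15),
               nrow = 5, byrow = TRUE,
               dimnames = list(LETTERS[1:5], c("I", "II", "III", "IV", "V")))

# Column chosen in each row of fit$solution (A-I, B-IV, C-V, D-III, E-II)
assigned_col <- c(A = 1, B = 4, C = 5, D = 3, E = 2)

# A two-column index matrix extracts one cell per row
times <- time[cbind(1:5, assigned_col)]
total <- sum(times)

print(times)
print(total)
```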
X and R chart


Exercise:  42 
Date:  27.02.2020 


Construct  X  and  R  chart  for  the  following  data. 


Sample   X1   X2   X3   X4   X5
1        83   79   81   82   83
2        83   81   85   87   81
3        85   87   83   84   86
4        80   81   83   84   83
5        83   84   85   83   84
6        88   87   89   90   88
7        80   81   82   84   81
8        79   89   88   89   89
9        78   83   85   86   93
10       88   83   82   85   82
11       78   80   78   82   81
12       81   85   85   85   84
13       77   82   84   85   87
14       81   85   85   85   84
15       85   87   82   85   89
16       83   83   77   81   80
17       85   84   84   80   82
18       82   83   80   80   83
19       75   77   84   77   78
20       85   85   86   83   80

R code and Output:

> qc<-read.table("U:\\ugst46\\x-bar and R chart data.csv",header=T,sep=",")
> qc
   Sample X1 X2 X3 X4 X5
1       1 83 79 81 82 83
2       2 83 81 85 87 81
3       3 85 87 83 84 86
4       4 80 81 83 84 83
5       5 83 84 85 83 84
6       6 88 87 89 90 88
7       7 80 81 82 84 81
8       8 79 89 88 89 89
9       9 78 83 85 86 93
10     10 88 83 82 85 82
11     11 78 80 78 82 81
12     12 81 85 85 85 84
13     13 77 82 84 85 87
14     14 81 85 85 85 84
15     15 85 87 82 85 89
16     16 83 83 77 81 80
17     17 85 84 84 80 82
18     18 82 83 80 80 83
19     19 75 77 84 77 78
20     20 85 85 86 83 80


[Figure: xbar Chart for pistonring, group summary statistics plotted against Group with UCL, CL and LCL lines]

Number of groups = 20
Center = 83.28       LCL = 79.93455   Number beyond limits = 4
StdDev = 2.493551    UCL = 86.62545   Number violating runs = 0
> library(qcc)
> pistonring<-qc[,-1]
> # - X Bar Chart - #
> qcc(pistonring,type="xbar",nsigma=3)
List of 11
 $ call      : language qcc(data = pistonring, type = "xbar", nsigmas = 3)
 $ type      : chr "xbar"
 $ data.name : chr "pistonring"
 $ data      : int [1:20, 1:5] 83 83 85 80 83 88 80 79 78 88 ...
  ..- attr(*, "dimnames")=List of 2
 $ statistics: Named num [1:20] 81.6 83.4 85 82.2 83.8 88.4 81.6 86.8 85 84 ...
  ..- attr(*, "names")= chr [1:20] "1" "2" "3" "4" ...
 $ sizes     : int [1:20] 5 5 5 5 5 5 5 5 5 5 ...
 $ center    : num 83.3
 $ std.dev   : num 2.49
 $ nsigmas   : num 3
 $ limits    : num [1, 1:2] 79.9 86.6
  ..- attr(*, "dimnames")=List of 2
 $ violations:List of 2
 - attr(*, "class")= chr "qcc"
> # - R Chart - #
> qcc(pistonring,type="R",nsigma=3)
List of 11
 $ call      : language qcc(data = pistonring, type = "R", nsigmas = 3)
 $ type      : chr "R"
 $ data.name : chr "pistonring"
 $ data      : int [1:20, 1:5] 83 83 85 80 83 88 80 79 78 88 ...
  ..- attr(*, "dimnames")=List of 2
 $ statistics: Named int [1:20] 4 6 4 4 2 3 4 10 15 6 ...
  ..- attr(*, "names")= chr [1:20] "1" "2" "3" "4" ...
 $ sizes     : int [1:20] 5 5 5 5 5 5 5 5 5 5 ...
 $ center    : num 5.8
 $ std.dev   : num 2.49
 $ nsigmas   : num 3
 $ limits    : num [1, 1:2] 0 12.3

[Figure: R Chart for pistonring, group ranges plotted against Group with UCL, CL and LCL lines]

Number of groups = 20
Center = 5.8         LCL = 0          Number beyond limits = 1
StdDev = 2.493551    UCL = 12.26392   Number violating runs = 0


Interpretation: 

We learnt the R coding to construct X bar and R chart for the given data.

1. The control limits for the X-bar chart are LCL = 79.93455 and UCL = 86.62545. From the control chart, the four sample means 6, 8, 11 and 19 fall outside the UCL and LCL. So we conclude that the process is out of control.

2. The control limits for the R chart are LCL = 0 and UCL = 12.26392. From the control chart, the 9th sample range falls outside the UCL. So we conclude that the process is out of control.
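
The X-bar limits can be rebuilt from the subgroup means and ranges as center +/- 3 * sigma / sqrt(n), with sigma estimated by (mean range) / d2. The sketch below assumes the tabulated constant d2 = 2.326 for subgroups of size 5, which reproduces the StdDev printed above; it does not call qcc, and the variable names are illustrative.

```r
pistonring <- matrix(c(
  83, 79, 81, 82, 83,
  83, 81, 85, 87, 81,
  85, 87, 83, 84, 86,
  80, 81, 83, 84, 83,
  83, 84, 85, 83, 84,
  88, 87, 89, 90, 88,
  80, 81, 82, 84, 81,
  79, 89, 88, 89, 89,
  78, 83, 85, 86, 93,
  88, 83, 82, 85, 82,
  78, 80, 78, 82, 81,
  81, 85, 85, 85, 84,
  77, 82, 84, 85, 87,
  81, 85, 85, 85, 84,
  85, 87, 82, 85, 89,
  83, 83, 77, 81, 80,
  85, 84, 84, 80, 82,
  82, 83, 80, 80, 83,
  75, 77, 84, 77, 78,
  85, 85, 86, 83, 80), ncol = 5, byrow = TRUE)

n    <- ncol(pistonring)                       # subgroup size
xbar <- rowMeans(pistonring)                   # subgroup means
rng  <- apply(pistonring, 1, function(v) max(v) - min(v))  # subgroup ranges

d2     <- 2.326                  # assumed table constant for n = 5
center <- mean(xbar)             # grand mean (centre line)
sigma  <- mean(rng) / d2         # process standard deviation estimate

lcl <- center - 3 * sigma / sqrt(n)
ucl <- center + 3 * sigma / sqrt(n)

# Subgroups whose mean lies outside the limits
outside <- which(xbar < lcl | xbar > ucl)

print(c(center = center, sigma = sigma, LCL = lcl, UCL = ucl))
print(outside)
```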
X and S chart


Exercise:  43 
Date:  27.02.2020 


Construct  X  and  S  chart  for  the  following  data. 


Sample   X1   X2   X3   X4   X5   X6
1        63   55   56   53   61   64
2        60   63   60   65   61   66
3        57   60   61   65   66   62
4        58   64   60   61   57   65
5        79   68   65   61   74   71
6        55   66   62   63   56   52
7        57   61   58   64   55   63
8        58   51   61   57   66   59
9        65   66   62   68   61   67
10       73   66   61   70   72   78
11       57   63   56   64   62   59
12       66   63   65   59   70   61
13       63   53   69   60   61   58
14       68   67   59   58   65   59
15       70   62   66   80   71   76
16       65   59   60   61   62   65
17       63   69   58   56   66   61
18       61   56   62   59   57   55
19       65   57   69   62   58   72
20       70   60   67   79   75   68

> # - X Bar Chart - #
> qcc(transactiontime,type="xbar",nsigma=3)
List of 11
 $ call      : language qcc(data = transactiontime, type = "xbar", nsigmas = 3)
 $ type      : chr "xbar"
 $ data.name : chr "transactiontime"
 $ data      : int [1:20, 1:6] 63 60 57 58 79 55 57 58 65 73 ...
  ..- attr(*, "dimnames")=List of 2
 $ statistics: Named num [1:20] 58.7 62.5 61.8 60.8 69.7 ...
  ..- attr(*, "names")= chr [1:20] "1" "2" "3" "4" ...
 $ sizes     : int [1:20] 6 6 6 6 6 6 6 6 6 6 ...
 $ center    : num 63
 $ std.dev   : num 4.68
 $ nsigmas   : num 3
 $ limits    : num [1, 1:2] 57.3 68.7
  ..- attr(*, "dimnames")=List of 2
 $ violations:List of 2
 - attr(*, "class")= chr "qcc"

> # - S Chart - #
> qcc(transactiontime,type="S",nsigma=3)
List of 11
 $ call      : language qcc(data = transactiontime, type = "S", nsigmas = 3)
 $ type      : chr "S"
 $ data.name : chr "transactiontime"
 $ data      : int [1:20, 1:6] 63 60 57 58 79 55 57 58 65 73 ...
  ..- attr(*, "dimnames")=List of 2
 $ statistics: Named num [1:20] 4.59 2.59 3.31 3.19 6.44 ...
  ..- attr(*, "names")= chr [1:20] "1" "2" "3" "4" ...
 $ sizes     : int [1:20] 6 6 6 6 6 6 6 6 6 6 ...
 $ center    : num 4.45
 $ std.dev   : num 4.68
 $ nsigmas   : num 3
 $ limits    : num [1, 1:2] 0.135 8.774
  ..- attr(*, "dimnames")=List of 2
 $ violations:List of 2
 - attr(*, "class")= chr "qcc"
[Figure: xbar Chart for transactiontime, group summary statistics plotted against Group with UCL, CL and LCL lines]

Number of groups = 20
Center = 63.00833    LCL = 57.28094   Number beyond limits = 4
StdDev = 4.676401    UCL = 68.73573   Number violating runs = 0

[Figure: S Chart for transactiontime, group standard deviations plotted against Group with UCL, CL and LCL lines]

Number of groups = 20
Center = 4.454429    LCL = 0.1352508  Number beyond limits = 0
StdDev = 4.68132     UCL = 8.773608   Number violating runs = 0


Interpretation: 

We learnt the R coding to construct X bar and S chart for the given data.

1. The control limits for the X-bar chart are LCL = 57.28094 and UCL = 68.73573. From the control chart, the four sample means 5, 10, 15 and 20 fall outside the UCL. So we conclude that the process is out of control.

2. The control limits for the S chart are LCL = 0.1352508 and UCL = 8.773608. From the control chart, all the sample standard deviations fall inside the LCL and UCL. So we conclude that the process variability is in control.
np Chart

Exercise:  44 
Date:  28.02.2020 


Consider the data on the number of defectives in samples of 100 ceramic substances.


Sample        1    2    3    4    5    6    7    8    9   10
Defectives   33   37   21   39   18   20   35   41   33   37
n           100  100  100  100  100  100  100  100  100  100

Sample       11   12   13   14   15   16   17   18   19   20
Defectives   25   41   24   30   31   19   35   27   15   19
n           100  100  100  100  100  100  100  100  100  100

Construct  control  chart  for  number  of  defectives  based  on  the  given  data. 


Aim: 

To  create  R  coding  to  construct  np  chart  for  the  given  data. 

R  code  and  Output: 


> # - np chart - #
> defectives<-c(33,37,21,39,18,20,35,41,33,37,25,41,24,30,31,19,35,27,15,19)
> samplesize<-rep(100,20)
> ceramic<-data.frame(defectives,samplesize)
> qcc(ceramic$defectives,sizes=ceramic$samplesize,type="np")
List of 11
 $ call      : language qcc(data = ceramic$defectives, type = "np", sizes = ceramic$samplesize)
 $ type      : chr "np"
 $ data.name : chr "ceramic$defectives"
 $ data      : num [1:20, 1] 33 37 21 39 18 20 35 41 33 37 ...
  ..- attr(*, "dimnames")=List of 2
 $ statistics: Named num [1:20] 33 37 21 39 18 20 35 41 33 37 ...
  ..- attr(*, "names")= chr [1:20] "1" "2" "3" "4" ...
 $ sizes     : num [1:20] 100 100 100 100 100 100 100 100 100 100 ...
 $ center    : num 29
 $ std.dev   : num 4.54
 $ nsigmas   : num 3
 $ limits    : num [1, 1:2] 15.4 42.6
  ..- attr(*, "dimnames")=List of 2
 $ violations:List of 2
 - attr(*, "class")= chr "qcc"
[Figure: np Chart for ceramic$defectives, number of defectives plotted against Group with UCL, CL and LCL lines]

Number of groups = 20
Center = 29          LCL = 15.38714   Number beyond limits = 1
StdDev = 4.537621    UCL = 42.61286   Number violating runs = 0


Interpretation: 

We learnt the R coding for construction of np chart for the given data.

The control limits for the np chart are LCL = 15.387 and UCL = 42.613. From the control chart, the 19th sample (15 defectives) falls below the LCL. So we conclude that the process is out of control.
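
The np chart limits follow from the binomial model: the centre line is the mean number of defectives, and the standard deviation is sqrt(n * p * (1 - p)) with p estimated by centre / n. A base R sketch of that calculation (it does not call qcc; p_bar, sigma, lcl, ucl and outside are illustrative names):

```r
defectives <- c(33, 37, 21, 39, 18, 20, 35, 41, 33, 37,
                25, 41, 24, 30, 31, 19, 35, 27, 15, 19)
n <- 100                      # constant sample size

center <- mean(defectives)    # centre line (n * p-bar)
p_bar  <- center / n          # estimated fraction defective
sigma  <- sqrt(n * p_bar * (1 - p_bar))

# Three-sigma limits; the lower limit cannot go below zero
lcl <- max(0, center - 3 * sigma)
ucl <- center + 3 * sigma

# Samples falling outside the limits
outside <- which(defectives < lcl | defectives > ucl)

print(c(center = center, sigma = sigma, LCL = lcl, UCL = ucl))
print(outside)
```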
P Chart


Exercise:  45 
Date:  2.03.2020 


Consider  the  data  on  Number  of  defectives  in  the  20  independent  samples  of  varying  sizes  from 
production  process. 


Sample        1    2    3    4    5    6    7    8    9   10
Defectives   33   37   21   39   18   20   35   41   33   37
n            80   95   65   90   85   80   75   80   90   90

Sample       11   12   13   14   15   16   17   18   19   20
Defectives   25   41   24   30   31   19   35   27   15   19
n           100  110   85   95   85   95  100  110   85   90

Construct  control  chart  for  number  of  defectives  based  on  the  given  data. 


Aim: 


To  create  R  coding  to  construct  p  chart  for  the  given  data. 

R  code  and  Output: 


> library(qcc)
> # - P chart - #
> defectives<-c(33,37,21,39,18,20,35,41,33,37,25,41,24,30,31,19,35,27,15,19)
> samplesize<-c(80,95,65,90,85,80,75,80,90,90,100,110,85,95,85,95,100,110,85,90)
> pd<-data.frame(defectives,samplesize)
> qcc(pd$defectives,sizes=pd$samplesize,type="p")
List of 11
 $ call      : language qcc(data = pd$defectives, type = "p", sizes = pd$samplesize)
 $ type      : chr "p"
 $ data.name : chr "pd$defectives"
 $ data      : num [1:20, 1] 33 37 21 39 18 20 35 41 33 37 ...
  ..- attr(*, "dimnames")=List of 2
 $ statistics: Named num [1:20] 0.412 0.389 0.323 0.433 0.212 ...
  ..- attr(*, "names")= chr [1:20] "1" "2" "3" "4" ...
 $ sizes     : num [1:20] 80 95 65 90 85 80 75 80 90 90 ...
 $ center    : num 0.325
 $ std.dev   : num 0.468
 $ nsigmas   : num 3
 $ limits    : num [1:20, 1:2] 0.168 0.181 0.151 0.177 0.173 ...
  ..- attr(*, "dimnames")=List of 2
 $ violations:List of 2
 - attr(*, "class")= chr "qcc"
[Figure: p Chart for pd$defectives; x-axis: Group; reference lines: UCL, CL, LCL]

Number of groups = 20
Center = 0.32493     LCL is variable    Number beyond limits = 1
StdDev = 0.4683487   UCL is variable    Number violating runs = 0


Interpretation: 

We  learnt  the  R  coding  for  construction  of  p  chart  for  the  given  data. 

From the control chart, the 8th sample point falls above the UCL, so we conclude
that the process is out of control.
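
The limits reported above can be cross-checked by hand. The following is a minimal base-R sketch, not the qcc implementation itself; it assumes the usual 3-sigma p chart formulas (centre line = total defectives / total inspected, limits = centre ± 3·sqrt(p̄(1 − p̄)/n)), and the variable names are illustrative only.

```r
# Base-R sketch of variable-limit p chart calculations (qcc not required).
defectives <- c(33, 37, 21, 39, 18, 20, 35, 41, 33, 37,
                25, 41, 24, 30, 31, 19, 35, 27, 15, 19)
samplesize <- c(80, 95, 65, 90, 85, 80, 75, 80, 90, 90,
                100, 110, 85, 95, 85, 95, 100, 110, 85, 90)

p     <- defectives / samplesize             # fraction defective per sample
p_bar <- sum(defectives) / sum(samplesize)   # centre line
sigma <- sqrt(p_bar * (1 - p_bar))           # per-unit standard deviation

# The limits vary with n because the standard error is sigma / sqrt(n).
ucl <- pmin(p_bar + 3 * sigma / sqrt(samplesize), 1)
lcl <- pmax(p_bar - 3 * sigma / sqrt(samplesize), 0)

beyond <- which(p > ucl | p < lcl)           # samples outside their limits
print(round(c(center = p_bar, std.dev = sigma), 5))
print(beyond)
```

Under these assumptions the centre line and standard deviation agree with the chart summary, and only sample 8 lies outside its limits.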
c Chart


Exercise:  46 
Date:  2.03.2020 


Number  of  defects  in  samples  of  five  Printed  circuit  Boards  is  given  below.  Construct  control  chart 
for  total  number  of  defects  in  a  sample  and  interpret. 


Sample         1    2    3    4    5    6    7    8    9   10
Defectives     6    4    8   10    9   12   16    2    3   10
n             50   50   50   50   50   50   50   50   50   50

Sample        11   12   13   14   15   16   17   18   19   20
Defectives     9   15    8   10    8    2    7    1    7   13
n             50   50   50   50   50   50   50   50   50   50

AIM: 


To  create  R  coding  to  construct  c  chart  for  the  given  data. 

R  code  and  Output: 


> library(qcc)
> # - c chart - #
> defects <- c(6,4,8,10,9,12,16,2,3,10,9,15,8,10,8,2,7,1,7,13)
> samplesize_l <- c(rep(50,20))
> circuit <- data.frame(defects, samplesize_l)
> qcc(circuit$defects, sizes=circuit$samplesize, type="c", title = "c chart for circuit based defective")
List of 11
 $ call      : language qcc(data = circuit$defects, type = "c", sizes = circuit$samplesize, title = "c chart for circuit based defective")
 $ type      : chr "c"
 $ data.name : chr "circuit$defects"
 $ data      : num [1:20, 1] 6 4 8 10 9 12 16 2 3 10 ...
  ..- attr(*, "dimnames")=List of 2
 $ statistics: Named num [1:20] 6 4 8 10 9 12 16 2 3 10 ...
  ..- attr(*, "names")= chr [1:20] "1" "2" "3" "4" ...
 $ sizes     : num [1:20] 50 50 50 50 50 50 50 50 50 50 ...
 $ center    : num 8
 $ std.dev   : num 2.83
 $ nsigmas   : num 3
 $ limits    : num [1, 1:2] 0 16.5
  ..- attr(*, "dimnames")=List of 2
 $ violations:List of 2
 - attr(*, "class")= chr "qcc"
[Figure: c chart for circuit based defective; x-axis: Group; reference lines: UCL, CL, LCL]

Number of groups = 20
Center = 8           LCL = 0           Number beyond limits = 0
StdDev = 2.828427    UCL = 16.48528    Number violating runs = 0


Interpretation: 

We  learnt  the  R  coding  to  construct  c  chart  for  the  given  data. 

The control limits for the c chart are LCL = 0 and UCL = 16.485. From the control chart, all the points
fall within the control limits, and we found no pattern or violating run. So we conclude that
the process is in control.
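
As a cross-check of the limits quoted above, here is a short base-R sketch. It assumes the standard Poisson-based c chart formulas (centre = mean count, standard deviation = square root of the centre, lower limit truncated at zero); the variable names are illustrative.

```r
# Base-R sketch of c chart limits (qcc not required).
defects <- c(6, 4, 8, 10, 9, 12, 16, 2, 3, 10,
             9, 15, 8, 10, 8, 2, 7, 1, 7, 13)

c_bar <- mean(defects)                 # centre line
sigma <- sqrt(c_bar)                   # Poisson standard deviation
ucl   <- c_bar + 3 * sigma             # upper control limit
lcl   <- max(c_bar - 3 * sigma, 0)     # a count cannot be negative

beyond <- which(defects > ucl | defects < lcl)
print(c(center = c_bar, std.dev = sigma, LCL = lcl, UCL = ucl))
print(length(beyond))                  # number of points beyond the limits
```

The largest count (16) stays just under the upper limit, which is why no point is flagged.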
U Chart


Exercise:  47 
Date:  2.03.2020 


Number  of  defects  in  samples  of  five  Printed  circuit  Boards  is  given  below.  Construct  control  chart 
for  total  number  of  defects  per  unit  in  a  sample  and  interpret. 


Sample         1    2    3    4    5    6    7    8    9   10
Defectives     6    4    8   10    9   12   16    2    3   10
n             50   50   50   50   50   50   50   50   50   50

Sample        11   12   13   14   15   16   17   18   19   20
Defectives     9   15    8   10    8    2    7    1    7   13
n             50   50   50   50   50   50   50   50   50   50

Aim: 


To  create  R  coding  to  construct  u  chart  for  the  given  data. 

R  code  and  Output: 


> library(qcc)
> # - U chart - #
> defects <- c(6,4,8,10,9,12,16,2,3,10,9,15,8,10,8,2,7,1,7,13)
> samplesize_l <- c(rep(50,20))
> circuit <- data.frame(defects, samplesize_l)
> qcc(circuit$defects, sizes=circuit$samplesize, type="u", title = "U chart for circuit board defectives")
List of 11
 $ call      : language qcc(data = circuit$defects, type = "u", sizes = circuit$samplesize, title = "U chart for circuit board defectives")
 $ type      : chr "u"
 $ data.name : chr "circuit$defects"
 $ data      : num [1:20, 1] 6 4 8 10 9 12 16 2 3 10 ...
  ..- attr(*, "dimnames")=List of 2
 $ statistics: Named num [1:20] 0.12 0.08 0.16 0.2 0.18 0.24 0.32 0.04 0.06 0.2 ...
  ..- attr(*, "names")= chr [1:20] "1" "2" "3" "4" ...
 $ sizes     : num [1:20] 50 50 50 50 50 50 50 50 50 50 ...
 $ center    : num 0.16
 $ std.dev   : num 0.4
 $ nsigmas   : num 3
 $ limits    : num [1, 1:2] 0 0.33
  ..- attr(*, "dimnames")=List of 2
 $ violations:List of 2
 - attr(*, "class")= chr "qcc"
[Figure: U chart for circuit board defectives; x-axis: Group; reference lines: UCL, CL, LCL]

Number of groups = 20
Center = 0.16    LCL = 0            Number beyond limits = 0
StdDev = 0.4     UCL = 0.3297056    Number violating runs = 0


Interpretation: 

We  learnt  the  R  coding  to  construct  U  chart  for  the  given  data. 

The control limits for the u chart are LCL = 0 and UCL = 0.3297. From the control chart, all the points
fall within the control limits, and we found no pattern or violating run. So we conclude that
the process is in control.
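
The u chart differs from the c chart only in that each count is divided by its sample size. A minimal base-R sketch of that calculation follows; it assumes the usual formulas (centre = total defects / total units, limits = centre ± 3·sqrt(centre / n)), with illustrative variable names.

```r
# Base-R sketch of u chart limits (qcc not required).
defects <- c(6, 4, 8, 10, 9, 12, 16, 2, 3, 10,
             9, 15, 8, 10, 8, 2, 7, 1, 7, 13)
n <- rep(50, 20)                          # units inspected per sample

u     <- defects / n                      # defects per unit
u_bar <- sum(defects) / sum(n)            # centre line
sigma <- sqrt(u_bar)                      # per-unit standard deviation
ucl   <- u_bar + 3 * sqrt(u_bar / n[1])   # all n are equal, so one limit
lcl   <- max(u_bar - 3 * sqrt(u_bar / n[1]), 0)

beyond <- which(u > ucl | u < lcl)
print(c(center = u_bar, std.dev = sigma, LCL = lcl, UCL = ucl))
print(length(beyond))
```

Because every sample has the same size here, the u chart is a rescaled c chart and leads to the same conclusion.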
COMPARISON OF SHEWHART AND CUSUM CHART


Exercise:  48 
Date:  4.03.2020 

For the following quality characteristics, construct Shewhart and Cusum control charts and give your
interpretations.


 9.45   7.99   9.29  11.66  12.16  10.18   8.04  11.46   9.2   10.34
 9.03  11.47  10.51   9.4   10.08   9.37  10.62  10.31   8.52  10.84
10.9    9.33  12.29  11.5   10.6   11.08  10.38  11.62  11.31  10.52

Aim: 


To create R coding to compare the Shewhart and Cusum control charts for the given data.

R  code  and  Output: 


> # Shewhart and cusum #
> library(qcc)
> qualitychar <- c(9.45,7.99,9.29,11.66,12.16,10.18,8.04,11.46,9.2,10.34,9.03,11.47,10.51,9.4,10.08,9.37,10.62,10.31,8.52,10.84,10.9
+ ,9.33,12.29,11.5,10.6,11.08,10.38,11.62,11.31,10.52)
> qcc(qualitychar, type="xbar.one", nsigmas=3)
List of 11
 $ call      : language qcc(data = qualitychar, type = "xbar.one", nsigmas = 3)
 $ type      : chr "xbar.one"
 $ data.name : chr "qualitychar"
 $ data      : num [1:30, 1] 9.45 7.99 9.29 11.66 12.16 ...
  ..- attr(*, "dimnames")=List of 2
 $ statistics: Named num [1:30] 9.45 7.99 9.29 11.66 12.16 ...
  ..- attr(*, "names")= chr [1:30] "1" "2" "3" "4" ...
 $ sizes     : int [1:30] 1 1 1 1 1 1 1 1 1 1 ...
 $ center    : num 10.3
 $ std.dev   : num 1.2
 $ nsigmas   : num 3
 $ limits    : num [1, 1:2] 6.72 13.91
  ..- attr(*, "dimnames")=List of 2
 $ violations:List of 2
 - attr(*, "class")= chr "qcc"
[Figure: xbar.one Chart for qualitychar; x-axis: Group; y-axis: Group summary statistics; reference lines: UCL, CL, LCL]

Number of groups = 30
Center = 10.315      LCL = 6.715404    Number beyond limits = 0
StdDev = 1.199865    UCL = 13.9146     Number violating runs = 2
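
The individuals chart limits above can be reproduced by hand. The sketch below is an assumption about how the standard deviation was estimated: it uses the average moving range of consecutive readings divided by the tabulated constant d2 = 1.128 (for ranges of two observations), which matches the reported StdDev. Variable names are illustrative.

```r
# Base-R sketch of individuals (xbar.one) chart limits via the moving range.
qualitychar <- c(9.45, 7.99, 9.29, 11.66, 12.16, 10.18, 8.04, 11.46, 9.2, 10.34,
                 9.03, 11.47, 10.51, 9.4, 10.08, 9.37, 10.62, 10.31, 8.52, 10.84,
                 10.9, 9.33, 12.29, 11.5, 10.6, 11.08, 10.38, 11.62, 11.31, 10.52)

mr     <- abs(diff(qualitychar))   # moving ranges of consecutive readings
d2     <- 1.128                    # tabulated constant for subgroups of size 2
sigma  <- mean(mr) / d2            # estimated process standard deviation
center <- mean(qualitychar)
lcl    <- center - 3 * sigma
ucl    <- center + 3 * sigma

print(round(c(center = center, std.dev = sigma, LCL = lcl, UCL = ucl), 6))
print(sum(qualitychar > ucl | qualitychar < lcl))   # points beyond the limits
```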


> cusum(qualitychar, decision.interval=5, std.dev=1, center=10, sizes=1)
List of 14
 $ call             : language cusum(data = qualitychar, sizes = 1, center = 10, std.dev = 1, decision.interval
 $ type             : chr "cusum"
 $ data.name        : chr "qualitychar"
 $ data             : num [1:30, 1] 9.45 7.99 9.29 11.66 12.16 ...
  ..- attr(*, "dimnames")=List of 2
 $ statistics       : Named num [1:30] 9.45 7.99 9.29 11.66 12.16 ...
  ..- attr(*, "names")= chr [1:30] "1" "2" "3" "4" ...
 $ sizes            : num [1:30] 1 1 1 1 1 1 1 1 1 1 ...
 $ center           : num 10
 $ std.dev          : num 1
 $ pos              : num [1:30] 0 0 0 1.16 2.82 ...
 $ neg              : num [1:30] -0.05 -1.56 -1.77 0 0 ...
 $ head.start       : num 0
 $ decision.interval: num 5
 $ se.shift         : num 1
 $ violations       :List of 2
 - attr(*, "class")= chr "cusum.qcc"


AJEETH.H  |  17-UST-046  |  PG  &  RESEARCH  DEPARTMENT  OF  STATISTICS,  LOYOLA  COLLEGE,  CHENNAI-34 




[Figure: cusum Chart for qualitychar; x-axis: Group; y-axis: Cumulative Sum; reference lines: UDB, LDB]

Number of groups = 30    Decision interval (std. err.) = 5
Center = 10              Shift detection (std. err.) = 1
StdDev = 1               No. of points beyond boundaries = 2


Interpretation: 

We learnt the R coding to compare the Shewhart and Cusum control charts for the given
data.

The Shewhart control chart identifies a possible shift in the process based on the runs, but
it does not provide strong evidence. The Cusum control chart provides evidence of a process
shift based on points lying beyond the upper decision boundary.
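
To see where the two boundary crossings come from, the cumulative sums can be computed by hand. This is a minimal base-R sketch of the tabular CUSUM, assuming the reference value is half the shift to detect (k = 0.5 standard errors) and the decision interval is 5, as in the call above; it is not the qcc source code, and the variable names are illustrative.

```r
# Base-R sketch of a tabular CUSUM with centre 10, std.dev 1, k = 0.5, h = 5.
qualitychar <- c(9.45, 7.99, 9.29, 11.66, 12.16, 10.18, 8.04, 11.46, 9.2, 10.34,
                 9.03, 11.47, 10.51, 9.4, 10.08, 9.37, 10.62, 10.31, 8.52, 10.84,
                 10.9, 9.33, 12.29, 11.5, 10.6, 11.08, 10.38, 11.62, 11.31, 10.52)
center <- 10; std.dev <- 1
k <- 1 / 2          # reference value: half of the shift (1 std. err.) to detect
h <- 5              # decision interval

z   <- (qualitychar - center) / std.dev   # standardised readings
pos <- numeric(length(z))                 # upper cumulative sum
neg <- numeric(length(z))                 # lower cumulative sum (stored as >= 0)
prev_pos <- 0; prev_neg <- 0
for (i in seq_along(z)) {
  pos[i] <- max(0, prev_pos + z[i] - k)
  neg[i] <- max(0, prev_neg - z[i] - k)
  prev_pos <- pos[i]; prev_neg <- neg[i]
}
neg <- -neg                               # plotted below the centre line

beyond <- which(pos > h | neg < -h)       # points beyond UDB or LDB
print(round(pos[1:5], 2))
print(round(neg[1:5], 2))
print(beyond)
```

Under these assumptions the upper sum builds up steadily over the last eight readings and crosses the upper boundary at groups 29 and 30, a drift that the individuals chart does not flag as a point beyond its limits.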
EXPONENTIAL MOVING AVERAGE CONTROL CHART


Exercise:  49 
Date:  6.03.2020 


For  the  following  quality  characteristics  construct  Exponential  Moving  average  control  chart  and 
give  your  interpretation. 


 9.45   7.99   9.29  11.66  12.16  10.18   8.04  11.46   9.2   10.34
 9.03  11.47  10.51   9.4   10.08   9.37  10.62  10.31   8.52  10.9
 9.33  12.29  11.5   10.6   11.08  10.38  11.62  11.31  10.52

Aim: 

To  create  R  codings  to  construct  exponential  moving  average  control  charts  for  the  given  data 

R  Code  &  Output 


> # - Exponential Moving Average Chart - #
> library(qcc)
Quality Control Charts and Statistical Process Control
version 2.7
Type 'citation("qcc")' for citing this R package in publications.
> QualityCharacteristics <- c(9.45,7.99,9.29,11.66,12.16,10.18,8.04,11.46,9.2,10.34,
+ 9.03,11.47,9.4,10.08,9.37,10.31,8.52,10.84,10.9,9.33,12.29,11.5,10.6,11.08,
+ 10.38,11.62,11.31,10.52)
> Quality <- ewma(QualityCharacteristics, lamda=0.1, nsigmas=2.7, std.dev=1, center=10)
[Figure: EWMA Chart for QualityCharacteristics; x-axis: Group; reference lines: UCL, LCL]

Number of groups = 28    Smoothing parameter = 0.2
Center = 10              Control limits at 2.7*sigma
StdDev = 1               No. of points beyond limits = 1


Interpretation: 

We learned R coding for exponentially weighted moving average control charts for the given data.

The EWMA control chart provides evidence of a shift in the process mean for the given characteristics,
and hence further inspection of the process is needed.
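
The EWMA statistic and its limits can also be worked out directly. The sketch below is a base-R illustration under stated assumptions: it uses the smoothing parameter 0.2 and the 2.7-sigma limits reported in the chart summary, starts the recursion at the centre value 10, and takes the 28 readings as they appear to have been entered in the session (some digits in the listing are hard to read). Variable names are illustrative.

```r
# Base-R sketch of an EWMA chart: z_i = lambda * x_i + (1 - lambda) * z_(i-1).
x <- c(9.45, 7.99, 9.29, 11.66, 12.16, 10.18, 8.04, 11.46, 9.2, 10.34,
       9.03, 11.47, 9.4, 10.08, 9.37, 10.31, 8.52, 10.84, 10.9, 9.33,
       12.29, 11.5, 10.6, 11.08, 10.38, 11.62, 11.31, 10.52)
lambda  <- 0.2      # smoothing parameter shown in the chart summary
nsigmas <- 2.7
center  <- 10
std.dev <- 1

z <- numeric(length(x))
prev <- center                        # the recursion starts at the centre line
for (i in seq_along(x)) {
  z[i] <- lambda * x[i] + (1 - lambda) * prev
  prev <- z[i]
}

# Time-varying standard error of the EWMA statistic.
i   <- seq_along(x)
se  <- std.dev * sqrt(lambda / (2 - lambda) * (1 - (1 - lambda)^(2 * i)))
ucl <- center + nsigmas * se
lcl <- center - nsigmas * se

beyond <- which(z > ucl | z < lcl)
print(round(z, 3))
print(beyond)
```

With these inputs the limits widen from 10 ± 0.54 at the first group towards 10 ± 0.9, and a single late point rises above the upper limit, in line with the one point beyond limits shown in the chart summary.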
PROCESS CAPABILITY ANALYSIS - I

Exercise:  50 
Date:  6.03.2020 

Construct  Cp,Cpl,  Cpu,  Cpk  and  Cpm  for  the  following  data  and  interpret  on  process  control. 


Sample   X1   X2   X3   X4   X5
     1   83   79   81   82   83
     2   83   81   85   87   81
     3   85   87   83   84   86
     4   80   81   83   84   83
     5   83   84   85   83   84
     6   88   87   89   90   88
     7   80   81   82   84   81
     8   79   89   88   89   89
     9   78   83   85   86   93
    10   88   83   82   85   82
    11   78   80   78   82   81
    12   81   85   85   85   84
    13   77   82   84   85   87
    14   81   85   85   85   84
    15   85   87   82   85   89
    16   83   83   77   81   80
    17   85   84   84   80   82
    18   82   83   80   80   83
    19   75   77   84   77   78
    20   85   85   86   83   80

Aim: 

To  create  r  coding  to  construct  process  capability  analysis  control  charts  for  the  given  data 

R  Code  and  Output: 
> library(qcc)
> data <- file.choose()
> datachart <- read.csv(data)
> piston <- datachart[,-1]
> q <- qcc(piston, type="xbar", nsigmas=3, plot=F)
> process.capability(q, spec.limits=c(80,84))

Process Capability Analysis

Call:
process.capability(object = q, spec.limits = c(80, 84))

Number of obs = 100          Target = 82
       Center = 83.28           LSL = 80
       StdDev = 2.494           USL = 84

Capability indices:

        Value     2.5%    97.5%
Cp    0.26736  0.23015   0.3045
Cp_l  0.43846  0.36341   0.5135
Cp_u  0.09625  0.04028   0.1522
Cp_k  0.09625  0.02955   0.1629
Cpm   0.23785  0.20165   0.2740

Exp<LSL 9.4%     Obs<LSL 11%
Exp>USL 39%      Obs>USL 36%

[Figure: Process Capability Analysis for piston; x-axis from 75 to 95; the summary printed beneath the plot repeats the values in the output above]
Interpretation:

We learned R coding to analyse the process by process capability analysis using Cp, Cpl, Cpu, Cpk
and Cpm.

The Cp, Cpl, Cpu, Cpk and Cpm are all below unity. Therefore we conclude that the process is
not capable, and immediate action should be taken to correct this situation.


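$$C_p = \frac{USL - LSL}{6\sigma}$$

$$C_{pl} = \frac{\bar{X} - LSL}{3\sigma}$$

$$C_{pu} = \frac{USL - \bar{X}}{3\sigma}$$

$$C_{pk} = \min(C_{pl}, C_{pu})$$

$$C_{pm} = \frac{C_p}{\sqrt{1 + \left(\dfrac{\bar{X} - T}{\sigma}\right)^2}}$$

where $T$ is the target value.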
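
The capability formulas can be applied directly to the summary values reported for the piston data. The following base-R sketch is a hand check rather than the package's own code; it takes Center = 83.28, StdDev = 2.493551, LSL = 80, USL = 84 and Target = 82 from the output above, and the variable names are illustrative.

```r
# Base-R sketch: capability indices from the reported summary statistics.
center  <- 83.28
std.dev <- 2.493551
lsl     <- 80
usl     <- 84
target  <- 82      # midpoint of the specification limits

cp  <- (usl - lsl) / (6 * std.dev)
cpl <- (center - lsl) / (3 * std.dev)
cpu <- (usl - center) / (3 * std.dev)
cpk <- min(cpl, cpu)
cpm <- cp / sqrt(1 + ((center - target) / std.dev)^2)

print(round(c(Cp = cp, Cp_l = cpl, Cp_u = cpu, Cp_k = cpk, Cpm = cpm), 5))
```

Cpu is the smaller of the two one-sided indices because the process centre sits close to the upper specification limit, so Cpk equals Cpu.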
PROCESS CAPABILITY ANALYSIS - II


Exercise:  51 
Date:  10.03.2020 


Construct  Cp,  Cpl,  Cpu,  Cpk,  Cpm  for  the  following  data  and  interpret  on  process  control. 


Hours    X1    X2    X3    X4    X5
    1  5.03  5.06  4.86  4.90  4.95
    2  4.97  4.94  5.09  4.78  4.88
    3  5.02  4.98  4.94  4.95  4.80
    4  4.97  4.93  4.90  4.92  4.96
    5  5.01  4.99  4.93  5.06  5.01
    6  5.00  4.95  5.10  4.85  4.91
    7  4.94  4.91  5.05  5.07  4.88
    8  5.00  4.98  5.05  4.96  4.97
    9  4.99  5.01  4.93  5.10  4.98
   10  5.03  4.98  4.92  5.01  4.93
   11  5.02  4.88  5.00  4.98  5.09
   12  5.09  5.01  5.13  4.89  5.02
   13  4.90  4.93  4.97  4.98  5.12
   14  5.04  4.96  5.15  5.04  5.02
   15  5.09  4.90  5.04  5.19  5.03
   16  5.10  5.01  5.04  5.05  5.02
   17  4.97  5.10  5.12  4.92  5.04
   18  5.01  4.99  5.06  5.04  5.12
R code and output:

> library(qcc)
> data <- file.choose()
> data <- read.csv(data)
> bearings <- data[,-1]
> q <- qcc(bearings, type="xbar", nsigmas=3, plot=F)
> process.capability(q, spec.limits=c(4.85,5.05))

Process Capability Analysis

Call:
process.capability(object = q, spec.limits = c(4.85, 5.05))

Number of obs = 90           Target = 4.95
       Center = 4.994           LSL = 4.85
       StdDev = 0.07882         USL = 5.05

Capability indices:

       Value    2.5%   97.5%
Cp    0.4229  0.3608  0.4849
Cp_l  0.6085  0.5138  0.7032
Cp_u  0.2373  0.1725  0.3021
Cp_k  0.2373  0.1601  0.3145
Cpm   0.3695  0.3095  0.4293

Exp<LSL 3.4%     Obs<LSL 2.2%
Exp>USL 24%      Obs>USL 20%
[Figure: Process Capability Analysis for bearings; x-axis from 4.8 to 5.2; vertical lines: LSL, Target, USL; the summary printed beneath the plot repeats the values in the output above]


Interpretation: 

We learned R coding to analyse the process by Process Capability Analysis.

The Cp, Cpl, Cpu, Cpk and Cpm are all below unity. Therefore we conclude that the process is not
capable, and immediate action should be taken to correct this situation.
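
The "Exp<LSL" and "Exp>USL" figures in the output are the proportions expected outside the specification limits. A minimal base-R sketch of that calculation is given below; it assumes the readings are normally distributed with the Center and StdDev shown in the plot summary (4.993889 and 0.07881915), and the variable names are illustrative.

```r
# Base-R sketch: expected proportion outside the specification limits,
# assuming a normal distribution with the reported centre and std. deviation.
center  <- 4.993889
std.dev <- 0.07881915
lsl     <- 4.85
usl     <- 5.05

exp_below_lsl <- pnorm((lsl - center) / std.dev)       # P(X < LSL)
exp_above_usl <- 1 - pnorm((usl - center) / std.dev)   # P(X > USL)

print(round(100 * c("Exp<LSL" = exp_below_lsl, "Exp>USL" = exp_above_usl), 1))
```

Most of the expected nonconformance lies above the USL, which is consistent with Cpu being much smaller than Cpl.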
OPERATING CHARACTERISTIC CURVE


Exercise:  52 
Date:  10.03.2020 


Construct  OC  curve  based  on  X-bar  chart  for  sample  sizes  n  =  1,  5,  10,  15  and  20 


Sample   X1   X2   X3   X4   X5
     1   83   79   81   82   83
     2   83   81   85   87   81
     3   85   87   83   84   86
     4   80   81   83   84   83
     5   83   84   85   83   84
     6   88   87   89   90   88
     7   80   81   82   84   81
     8   79   89   88   89   89
     9   78   83   85   86   93
    10   88   83   82   85   82
    11   78   80   78   82   81
    12   81   85   85   85   84
    13   77   82   84   85   87
    14   81   85   85   85   84
    15   85   87   82   85   89
    16   83   83   77   81   80
    17   85   84   84   80   82
    18   82   83   80   80   83
    19   75   77   84   77   78
    20   85   85   86   83   80
R Code and Output:

library(qcc)
# - OC Curve - #
data=read.csv(file.choose())
piston<-data[,-1]
beta<-oc.curves.xbar(qcc(piston, type="xbar", nsigmas=3, plot=F))
print(round(beta, digits=4))

                sample size
shift (std.dev)    n=5    n=1   n=10   n=15   n=20
           0    0.9973 0.9973 0.9973 0.9973 0.9973
           0.05 0.9971 0.9973 0.9970 0.9968 0.9966
           0.1  0.9966 0.9972 0.9959 0.9952 0.9944
           0.15 0.9957 0.9970 0.9940 0.9920 0.9900
           0.2  0.9944 0.9968 0.9909 0.9869 0.9823
           0.25 0.9925 0.9964 0.9864 0.9789 0.9701
           0.3  0.9900 0.9960 0.9798 0.9670 0.9514
           0.35 0.9866 0.9956 0.9708 0.9500 0.9243
           0.4  0.9823 0.9950 0.9586 0.9266 0.8871
           0.45 0.9769 0.9943 0.9426 0.8957 0.8383
           0.5  0.9701 0.9936 0.9220 0.8562 0.7775
           0.55 0.9616 0.9927 0.8963 0.8078 0.7055
           0.6  0.9514 0.9916 0.8649 0.7505 0.6243
           0.65 0.9390 0.9905 0.8275 0.6853 0.5371
           0.7  0.9243 0.9892 0.7842 0.6137 0.4481
           0.75 0.9071 0.9877 0.7351 0.5379 0.3616
           0.8  0.8871 0.9860 0.6809 0.4608 0.2817
           0.85 0.8642 0.9842 0.6225 0.3851 0.2115
           0.9  0.8383 0.9821 0.5612 0.3136 0.1527
           0.95 0.8094 0.9798 0.4983 0.2485 0.1059
           1    0.7775 0.9772 0.4355 0.1913 0.0705
           1.05 0.7428 0.9744 0.3743 0.1431 0.0450
           1.1  0.7055 0.9713 0.3161 0.1038 0.0275
           1.15 0.6659 0.9678 0.2622 0.0730 0.0161
           1.2  0.6243 0.9641 0.2134 0.0497 0.0090
           1.25 0.5812 0.9599 0.1703 0.0328 0.0048
           1.3  0.5371 0.9554 0.1333 0.0209 0.0024
           1.35 0.4925 0.9505 0.1022 0.0129 0.0012
           1.4  0.4481 0.9452 0.0768 0.0077 0.0006
           1.45 0.4043 0.9394 0.0564 0.0045 0.0002
[Figure: OC curves for xbar Chart]


Interpretation: 

We  learnt  the  R  coding  for  constructing  OC  curve. 

From the OC curve, we can conclude which sample size provides an acceptable level of beta
error for detecting a shift in the process. With a sample of size 20, 5% or fewer defectives can
be accepted with 100% probability, and more than 5% defectives is rejected, that is, accepted with
0% probability.
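
The tabulated beta values can be reproduced from the normal distribution. The sketch below is a base-R illustration, assuming 3-sigma limits and a shift expressed in process standard deviations, so that the probability of not detecting the shift on one sample of size n is Φ(3 − δ√n) − Φ(−3 − δ√n). The helper `oc_beta` is a hypothetical name introduced here, not a qcc function.

```r
# Base-R sketch: type II error (beta) of an x-bar chart for a mean shift.
# oc_beta() is a hypothetical helper written for this illustration.
oc_beta <- function(shift, n, nsigmas = 3) {
  pnorm(nsigmas - shift * sqrt(n)) - pnorm(-nsigmas - shift * sqrt(n))
}

shift <- seq(0, 1.45, by = 0.05)       # shift in process standard deviations
n     <- c(5, 1, 10, 15, 20)           # sample sizes, in the tabulated order

beta <- sapply(n, function(k) oc_beta(shift, k))
dimnames(beta) <- list(as.character(round(shift, 2)), paste0("n=", n))
print(round(beta, 4))
```

For a shift of one standard deviation, for example, beta falls from about 0.98 with n = 1 to about 0.07 with n = 20, which shows how larger samples make the chart more sensitive.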